A number of data sets are useful for researchers studying the science of science funding.  A short list is provided here:

Dimensions:  https://www.dimensions.ai/

  • Researchers can access the free Dimensions application, which covers 97 million publications contextualized with grants, patents and clinical trials, at https://app.dimensions.ai/.

Lens:  https://www.lens.org/

  • Lens hosts most of the world's patent information along with scholarly literature (from sources such as PubMed, CrossRef and Microsoft Academic), creating open public innovation portfolios of individuals and institutions.

RISIS:  http://datasets.risis.eu/

  • The EU-funded RISIS covers data sets on public sector research and research careers, as well as a repository of research and innovation policy evaluations. It includes, inter alia:
    • EUPRO, a data set comprising information on R&D projects and all participating organizations funded by the European Framework Programmes (FP).
    • PROFILE, a longitudinal, multi-cohort panel study focusing on the situation of doctoral candidates and their postdoctoral professional careers. The sample consists of doctoral candidates at universities and funding organizations in Germany.
    • The Science and Innovation Policy Evaluation Repository (SIPER), a database of science and innovation policy evaluations from across the world.
    • The RISIS-ETER facility, a set of databases providing a register of European Higher Education Institutions and containing basic statistical information on them, including descriptors, geographical information, students and graduates, personnel, finances, and research activities.
    • The CWTS Leiden Ranking, a university ranking database focusing on the output and impact of research.

Marx/Fuegi patent-to-paper linkages:  http://relianceonscience.org

  • We link worldwide patents to scientific literature, harvesting more than 30 million references from the front pages and body text of patents. Linkages are available for the Microsoft Academic Graph as well as PubMed. Each linkage has an applicant/examiner indicator as well as a confidence score.

Data on federally-funded patents

Patent text data and code for calculating text-based similarity between any two utility patents granted by the USPTO between 1976 and 2013, or between any two patent portfolios

For more information, see Arts, Sam, Bruno Cassiman, and Juan Carlos Gomez.  "Text matching to measure patent similarity." Strategic Management Journal 39, no. 1 (2018): 62-84.
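
To give a sense of what a text-based similarity between two patents can look like, the following is a minimal sketch in Python. It assumes a Jaccard index over the sets of unique keywords in two texts. The stop-word list, the tokenization rule, and the function names are illustrative assumptions, not the code distributed with the data set, and the preprocessing in the cited paper may differ.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for
# illustration; not the list used with the actual data set).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def keywords(text):
    """Return the set of unique lowercase tokens in text, minus stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def jaccard_similarity(text_a, text_b):
    """Jaccard index of two texts: shared keywords / all keywords."""
    a, b = keywords(text_a), keywords(text_b)
    union = a | b
    if not union:
        return 0.0  # two empty texts: define similarity as 0
    return len(a & b) / len(union)


def portfolio_similarity(portfolio_a, portfolio_b):
    """Compare two portfolios (lists of patent texts) by pooling the
    keywords of each portfolio before taking the Jaccard index.
    Pooling is one possible design choice, assumed here for simplicity."""
    return jaccard_similarity(" ".join(portfolio_a), " ".join(portfolio_b))


if __name__ == "__main__":
    # Made-up example texts, not real patent titles.
    print(jaccard_similarity("Method for coating a steel wire",
                             "Steel wire coating apparatus"))
```

The score ranges from 0 (no shared keywords) to 1 (identical keyword sets). Because it works on sets, repeated words within a text do not affect the result.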