The Databases

The Lens

Previously known as the Patent Lens, this is a well-designed site with quite a few visualisation options and access to sequence data. It is possible to search the title, abstract, description and claims of patent documents and to create and share data in collections. In 2015 the ability to download up to 10,000 records at a time was added. Combined with interactive charts that allow the user to drill down into a results set, this has transformed the Lens into a very useful and innovative database and visualisation tool.

Patentscope

The WIPO Patentscope database provides access to Patent Cooperation Treaty data, including downloads of a selection of fields (up to 10,000 records), a very useful translation tool for expanding searches, and translation.

Sequence data can also be obtained from Patentscope. Note that this rapidly becomes gigabytes of data.

espacenet

Probably the best-known free patent database, from the European Patent Office.

LATIPAT

For readers in Latin America (or Spain and Portugal), LATIPAT is a very useful resource.

EPO Open Patent Services

Access patent data through the EPO Application Programming Interface (API) free of charge. Requires programming knowledge.

The developer portal allows you to test your API queries and is recommended.
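
To give a feel for the programming involved, here is a minimal sketch of how a search request to the API might be put together in Python. It only builds the request and does not send it. The base URL, the `q` query parameter and the bearer-token header reflect our understanding of OPS version 3.2 and should be treated as assumptions to check in the developer portal. The function name `build_ops_search` and the placeholder token are purely illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL for the OPS 3.2 bibliographic search service;
# confirm the current value in the developer portal.
OPS_SEARCH_URL = "https://ops.epo.org/3.2/rest-services/published-data/search"


def build_ops_search(cql_query: str, access_token: str) -> Request:
    """Build (but do not send) a search request for a CQL query string."""
    # urlencode escapes characters such as "=" and spaces in the query.
    url = f"{OPS_SEARCH_URL}?{urlencode({'q': cql_query})}"
    headers = {
        # Assumption: the token obtained at sign-up is sent as a bearer token.
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/json",
    }
    return Request(url, headers=headers)


# "ti=pizza" is an illustrative title search; the token is a placeholder.
request = build_ops_search("ti=pizza", access_token="YOUR-TOKEN")
print(request.full_url)
```

Pasting the printed URL into the developer portal is a quick way to check that a query is well formed before writing any code to send it.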

USPTO Patents View

The Patents View site for free searches and the USPTO patent databases may be archaic, but you can download the entire US collection from the Google USPTO Bulk download service.

It is a fantastic service, and an example to patent offices everywhere on freeing up patent data. If you have a good broadband connection and the hard drive space, it is quite good fun to suddenly have access to millions of patent records. The authors used the service to text mine the collection for millions of biological species names as reported here.

However, one important issue to note is that the XML separating individual documents in the bulk files is not always well demarcated. This means that code that works for one bulk set of files may fail on another. While it is possible to address this, be prepared to spend time working on it and/or to seek assistance from a professional programmer. For an insight into these issues see this Stack Overflow discussion on parsing the data in R.
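
One way to approach the demarcation problem is sketched below in Python. It assumes that a bulk file is a run of XML documents placed end to end, each beginning with its own `<?xml ...?>` declaration. That is an assumption about the files rather than a guarantee, and the sample data and function names are invented for illustration. The sketch splits the text at each declaration, parses each piece separately, and sets aside the pieces that fail instead of stopping at the first error.

```python
import re
import xml.etree.ElementTree as ET
from typing import Iterator

# Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
XML_DECLARATION = re.compile(r"<\?xml\s[^>]*\?>")


def split_documents(bulk_text: str) -> Iterator[str]:
    """Yield each chunk of text that starts at an XML declaration."""
    starts = [match.start() for match in XML_DECLARATION.finditer(bulk_text)]
    for begin, end in zip(starts, starts[1:] + [len(bulk_text)]):
        chunk = bulk_text[begin:end].strip()
        if chunk:
            yield chunk


def parse_documents(bulk_text: str):
    """Parse every chunk that can be parsed and collect the ones that fail."""
    parsed, failed = [], []
    for chunk in split_documents(bulk_text):
        try:
            parsed.append(ET.fromstring(chunk))
        except ET.ParseError:
            failed.append(chunk)  # keep for later inspection
    return parsed, failed


# Invented sample: two documents end to end, the second deliberately broken.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<patent><title>First</title></patent>\n"
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<patent><title>Second</title>\n"
)
parsed, failed = parse_documents(sample)
print(len(parsed), "parsed;", len(failed), "failed")
```

Keeping the failed chunks makes it easier to see how a new set of files differs from the set the code was written for.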

Free Patents Online

Sign up for a free account for enhanced access and to save and download data. It has been around for quite a while now, and while the download options are limited, we rather like it.

DEPATISnet

We are not covering national databases. However, the patent database of the German Patent and Trademark Office struck us as potentially very useful. It allows for searches in English and German and has extensive coverage of international patent data, including the China, EP, US and PCT collections. The coverage details are here. Worth experimenting with.

OECD Patent Databases

One that is more for patent statisticians. The OECD has invested a lot of effort in developing patent indicators and resources, including citations, the Harmonised Applicant Names (HAN) database, and mapping through the REGPAT database, among other resources that are available free of charge.

Along the same lines, the US National Bureau of Economic Research (NBER) US Patent Citations Data File is an important resource.

EPO World Patent Statistical Database

The most important database for statistical use is the EPO World Patent Statistical Database (PATSTAT), which contains around 90 million records. PATSTAT is not free and costs 1250 Euro for a year (two editions) or 630 Euro for a single edition. The main barrier to using PATSTAT is the need to run and maintain a database of more than 200 gigabytes. However, there is also an online version of PATSTAT that is free for the first two months if you wish to try it by signing up for the trial (knowledge of SQL required).
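
For readers wondering what the SQL involved looks like, the following is a small sketch run against an in-memory SQLite database in Python. The table name `tls201_appln` and the columns `appln_id`, `appln_auth` and `appln_filing_year` follow our understanding of the PATSTAT applications table, but they should be treated as assumptions and checked against the documentation for the edition in use. The four rows are invented toy data, not PATSTAT records.

```python
import sqlite3

# A toy stand-in for the PATSTAT applications table (assumed column names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tls201_appln ("
    "appln_id INTEGER PRIMARY KEY, "
    "appln_auth TEXT, "
    "appln_filing_year INTEGER)"
)
conn.executemany(
    "INSERT INTO tls201_appln VALUES (?, ?, ?)",
    [(1, "EP", 2012), (2, "US", 2012), (3, "US", 2013), (4, "US", 2013)],
)

# Count applications by filing authority and filing year.
query = """
    SELECT appln_auth, appln_filing_year, COUNT(*) AS applications
    FROM tls201_appln
    GROUP BY appln_auth, appln_filing_year
    ORDER BY appln_auth, appln_filing_year
"""
counts = conn.execute(query).fetchall()
for authority, year, applications in counts:
    print(authority, year, applications)
```

The same kind of grouped count is a typical first query to try in the online trial, subject to the table and column names being confirmed there.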

For users seeking to load PATSTAT into a MySQL database, Simone Mainardi provides the following code on GitHub.

Other data sources

A number of companies provide access to patent data, typically with tiered access depending on your needs and budget. Examples include Thomson Innovation, Questel Orbit, STN, and PatBase. We will not be focusing on these services, but we will look at the use of data tools to work with data from services such as Thomson Innovation.

For more information on free and commercial data providers, try the excellent Patent Information User Group and its list of Patent Databases from Tom Wolff and Robert Austin.

Also worth mentioning is the Landon IP Intellogist blog, which maintains Search System Reports.
