Thursday, 12 March 2020

Using the Data Miner Chrome extension to collate Google results


Last week I ran a search for a rapid review that was particularly heavy on grey literature. As a result, I ended up with several domain-specific search strings for Google. Originally, I thought I could either copy the first 100 results into a Word document for screening, or screen the results for relevance myself. However, the former is not very user-friendly, and the latter would inevitably introduce some bias. Additionally, speed was key for this rapid review!

I decided to look into how I could pull all the required information from the Google results in a way that would let the reviewer screen the results efficiently themselves. My colleagues know that I LOVE a Chrome extension, and now I have found yet another one in the Data Miner Scraper.

Data Miner is a really handy tool that lets you quickly scrape the information you want from any webpage. It offers a bunch of public “Recipes” (under the "Public" tab) that can pull a range of information depending on your needs. In this case, I simply needed the URL, title, and summary info (effectively what you would see as you scroll through the results normally). I found a recipe that would do this, and simply ran it on each page of results from my searches.

You can then export the results into Excel. Once I had them in this format, it was also really easy to de-duplicate the results using the URLs. So in the end, I had a much more user-friendly format, with de-duplicated results, which would enable my colleague to screen the results more easily.
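For anyone who would rather script the de-duplication step than do it by hand in Excel, here is a minimal sketch in Python. It assumes the scraped results have been saved as a CSV file with a column called "URL"; that column name, the file names, and the function names are all placeholders of mine, not anything Data Miner defines. It also treats URLs that differ only by capitalisation of the domain, a trailing slash, or a "#fragment" as the same page, which is a judgement call you may want to change.

```python
import csv
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url):
    """Return a comparison key for a URL (hypothetical helper).

    Lower-cases the scheme and domain, drops any #fragment, and strips
    a trailing slash from the path, so near-identical links match.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def dedupe_by_url(rows, url_column="URL"):
    """Keep the first row seen for each normalised URL."""
    seen = set()
    unique = []
    for row in rows:
        key = normalise_url(row[url_column])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def dedupe_csv(in_path, out_path, url_column="URL"):
    """Read a CSV export, write a de-duplicated copy, return both counts."""
    with open(in_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)
    unique = dedupe_by_url(rows, url_column)
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(unique)
    return len(rows), len(unique)
```

Calling something like `dedupe_csv("google_results.csv", "google_results_deduped.csv")` would then give you the number of rows before and after, which is handy for reporting how many duplicates were removed.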

I have yet to look into whether the exported results could be uploaded to a reference manager so that everything could be screened in one place (seems feasible?). However, for now I’m glad I’ve found Data Miner and I know I’ll be using it in more searches in the future.
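On the reference manager idea, one route that looks plausible is converting each result into an RIS record, since RIS is a plain-text format that many reference managers can import. The sketch below is untested against any particular reference manager, and the column names "Title", "URL" and "Summary" are my assumptions about what the export contains, so adjust them to match your own spreadsheet.

```python
def clean(text):
    """Collapse line breaks and repeated spaces into single spaces."""
    return " ".join(text.split())


def row_to_ris(row, title_column="Title", url_column="URL",
               summary_column="Summary"):
    """Build one RIS record from a result row (column names are assumed)."""
    lines = [
        "TY  - ELEC",                             # record type: web page
        "TI  - " + clean(row[title_column]),      # title
        "UR  - " + clean(row[url_column]),        # link to the result
        "AB  - " + clean(row[summary_column]),    # Google summary as abstract
        "ER  - ",                                 # end of record
    ]
    return "\n".join(lines)


def rows_to_ris(rows, **columns):
    """Join all records into one RIS document, separated by blank lines."""
    return "\n\n".join(row_to_ris(row, **columns) for row in rows) + "\n"
```

Writing the output of `rows_to_ris(rows)` to a file ending in `.ris` would give you something to try importing; whether the summary text ends up where the screener expects it will depend on the tool.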


  • You can download Data Miner here (big bonus was that I didn’t need permission from IT!).

  • They also have a YouTube channel with lots of tutorials which can be accessed here.

