Thursday 12 March 2020

Using the Data Miner Chrome extension to collate Google results


This last week I ran a search for a rapid review which was particularly heavy on grey literature. As a result, I ended up with several domain-specific search strings for Google. Originally, I thought I could either copy the first 100 results into a Word document for screening, or screen the results for relevance myself. However, the former is not very user-friendly, and the latter would inevitably introduce some bias. Additionally, speed was key for this rapid review!

I decided to look into how I could pull all the required information from the Google results in a way that would enable the reviewer to screen the results efficiently themselves. My colleagues know that I LOVE a Chrome extension, and now I have found yet another one in the Data Miner Scraper.

Data Miner is a really handy tool which enables you to quickly scrape the information you want from any webpage. They have a bunch of public “Recipes” available (under the “Public” tab) which can pull out a range of information depending on your needs. In this case, I simply needed the URL, title, and summary info (effectively what you would see as you’re scrolling through the results normally). I found a recipe that would do this, and simply ran it on each of the results pages from my searches.

You can then export the results into Excel. Once I had it in this format it was also really easy to de-duplicate the results using the URLs. So in the end, I had a much more user-friendly format, with de-duplicated results, which would enable my colleague to screen the results more easily.
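I did the de-duplication in Excel, but the same idea can be sketched in Python for anyone who prefers a script. This is only a minimal sketch: the column name "url" is an assumption (the exported recipe may label its columns differently), and the normalisation rules (ignoring http/https, "www." and a trailing slash) are my own illustrative choices rather than anything Data Miner does.

```python
import csv
from urllib.parse import urlsplit


def normalise_url(url):
    """Build a comparison key so trivially different URLs count as duplicates."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def dedupe_rows(rows, url_field="url"):
    """Keep the first row seen for each normalised URL, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = normalise_url(row[url_field])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def dedupe_csv(in_path, out_path, url_field="url"):
    """Read an exported CSV, drop duplicate URLs, and write the result."""
    with open(in_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = dedupe_rows(list(reader), url_field)
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Calling dedupe_csv("google_results.csv", "google_results_deduped.csv") would write a de-duplicated copy; both file names are hypothetical.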

I am yet to look into whether this could be uploaded to a reference manager so the results could all be screened in one place (seems feasible?). However, for now I’m glad I’ve found Data Miner and I know I’ll be using it in more searches in the future.
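As a rough idea of what that might involve, the sketch below turns one scraped row into a record in RIS, the plain-text format most reference managers can import. I have not tested this against any particular reference manager, and the field names "title", "url" and "summary" are assumptions about the export. It uses the RIS type ELEC (web page) with the TI, UR and AB tags.

```python
def clean(text):
    """Collapse line breaks and repeated spaces so each RIS field stays on one line."""
    return " ".join(str(text).split())


def row_to_ris(row):
    """Convert one scraped result (a dict) into a single RIS record."""
    lines = [
        "TY  - ELEC",
        "TI  - " + clean(row["title"]),
        "UR  - " + clean(row["url"]),
    ]
    if row.get("summary"):
        lines.append("AB  - " + clean(row["summary"]))
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


def rows_to_ris(rows):
    """Join several records into the text of one .ris file."""
    return "\n\n".join(row_to_ris(r) for r in rows) + "\n"
```

Writing the output of rows_to_ris to a file ending in .ris should give something a reference manager can attempt to import, though how the fields are displayed will depend on the software.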


  • You can download Data Miner here (big bonus was that I didn’t need permission from IT!).

  • They also have a YouTube channel with lots of tutorials which can be accessed here.


COVID-19 resources

Our colleague Keith Nockels at the University of Leicester is currently putting together information on COVID-19 on his blog. Take a look here: http://browsing.blogspot.com/2020/01/outbreak-of-novel-coronavirus.html

Using Global Index Medicus in systematic reviews


I recently had cause to search Global Index Medicus (GIM) from the World Health Organisation (WHO) as part of an international systematic review that I’m working on. The team specifically needed to search databases that covered low- and middle-income countries (LMICs). Global Index Medicus brings together five regional databases: African Index Medicus (AIM), Index Medicus for the Eastern Mediterranean Region (IMEMR), Index Medicus for the South-East Asia Region (IMSEAR), Latin American and Caribbean Health Sciences Literature (LILACS), and Western Pacific Region Index Medicus (WPRIM). Through the WHO interface it is possible to search all of these databases combined, or individually.

When it came to constructing the search strategy for the systematic review, I’d never used GIM before. I had a look around for any guides to searching the database effectively and came up blank.

Through some experimentation, I found that the advanced search allowed me to search using MeSH descriptors, and having already built my search in Medline, that meant straightforward transposing of the terms should be possible. However, the advanced search did not allow me to build a strategy in the same way I would in other databases, as the search lines were not numbered. So, to get around this issue, I searched for all of the MeSH terms and keywords for each concept, combined with the OR operator. Once that search was run, GIM gave me a single-line search string in the search box, which I copied to a Word document. I repeated this process for each concept in my search strategy and ended up with three single-line search strings, each combined internally with the OR operator, which the database then allowed me to combine with the AND operator using the Advanced Search feature. It was a lengthy search strategy and a lengthy process!
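The logic of that workaround (OR within each concept, then AND between concepts) can be sketched in a few lines of Python. This is purely illustrative: the terms in the usage note are placeholders rather than my actual strategy, and the sketch does not attempt to reproduce GIM's own field syntax for MeSH descriptors, so the output would still need adapting to what the interface accepts.

```python
def or_block(terms):
    """Join the terms for one concept with OR, quoting multi-word phrases."""
    quoted = ['"' + t + '"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def combine_concepts(concepts):
    """AND together the OR-blocks for each concept in the strategy."""
    return " AND ".join(or_block(terms) for terms in concepts)
```

For example, combine_concepts([["malaria", "plasmodium infection"], ["child", "infant"]]) builds one OR-block per concept and joins them with AND, which mirrors the three single-line strings I combined by hand.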

What I discovered:
  • Adjacency operators did not seem to work, or at least, ADJ and NEAR were not recognised by GIM.
  • It is possible to select and download references within the search results.
  • The best way to view all results was by downloading the .csv file and reading it in Excel (which is often what the systematic reviewers I work with want to see). There was no limit on the number of records, and I was able to download 3000+ results in a single file.
  • Downloading .ris files for reference management software could only be done in batches of 100 (so quite a time-consuming process).
  • Keep a copy of your strategy safe so you can re-run it immediately prior to submission to your journal of choice.
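To give a sense of what the 100-record limit on .ris downloads means in practice, here is a small sketch that works out how many batches a result set needs and stitches the downloaded batches back into one file. The figure of 3,000 records in the usage note is just a round illustration, and the merge step assumes each batch is an ordinary RIS text export; I have not checked how GIM's files are encoded.

```python
import math


def batches_needed(total_records, batch_size=100):
    """Number of separate downloads required for a result set."""
    return math.ceil(total_records / batch_size)


def batch_ranges(total_records, batch_size=100):
    """First and last record number for each batch, counting from 1."""
    ranges = []
    for start in range(1, total_records + 1, batch_size):
        end = min(start + batch_size - 1, total_records)
        ranges.append((start, end))
    return ranges


def merge_ris(chunks):
    """Join the text of several RIS exports into one, dropping blank lines."""
    parts = []
    for chunk in chunks:
        lines = [line for line in chunk.splitlines() if line.strip()]
        if lines:
            parts.append("\n".join(lines))
    return "\n\n".join(parts) + "\n"
```

With these definitions, batches_needed(3000) comes to 30 downloads, which is why the .csv export was so much quicker for simply viewing the results.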

If you want a truly global slant on your systematic review, it’s a fantastic resource that’s free to use. It will just take a little bit of time to navigate.