Neel Shah Data Analyst and Business Analyst

Research trends of AI in India


The goal of this data analysis is to identify trends in ongoing research at Indian universities and companies. It also provides a quick glimpse of top authors and research papers. To put this in perspective, we compare this data to its global equivalent, which reveals some surprising differences in research trends.

Note: All data relates to the computer science field only.

Created by:

Neel Shah Website LinkedIn GitHub Email:

Guided by:

1) Malaikannan Sankarasubbu LinkedIn GitHub
2) Dr. Jacob Minz LinkedIn GitHub
3) Anirban Santara LinkedIn GitHub

Technical Implementation - Open Source license

This analysis was implemented in a Jupyter notebook running on the Anaconda distribution of Python 3.6+. We used the SCOPUS journal dataset to examine papers and research conducted by Indian authors within India. This dataset covers 1,387 papers published between 2001 and 2016. For global data, we used the arXiv dataset, which contains more than 24,700 papers. The Jupyter notebook (code) and the arXiv dataset are freely available under the MIT open source license. The SCOPUS journal dataset, however, is not available under an open source license.

About the SCOPUS journal:

About arXiv

arXiv is an e-print service in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance and statistics. Submissions to arXiv should conform to Cornell University academic standards. arXiv is owned and operated by Cornell University, a private not-for-profit educational institution. arXiv is funded by Cornell University Library, the Simons Foundation, and member institutions. [Ref]

Results of data analysis:

This first pie-chart shows that 14.42% of the research comes from industry, compared to 85.58% from universities. It is somewhat surprising that industry's share of research in India is this low relative to other countries.
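Shares like the ones in this pie-chart can be computed from one sector label per paper. The following is a minimal sketch, not the notebook's actual code: the function name `sector_shares` and the sample labels are hypothetical, and the sample is not taken from the SCOPUS dataset.

```python
from collections import Counter


def sector_shares(affiliation_types):
    """Return each sector's share of papers as a percentage (2 decimals)."""
    counts = Counter(affiliation_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sector: round(100 * n / total, 2) for sector, n in counts.items()}


# Hypothetical labels, one per paper; NOT data from the SCOPUS dataset.
sample = ["university"] * 6 + ["industry"] * 2
print(sector_shares(sample))
```

The same function would work for the company and university breakdowns if each paper were labeled with an organization name instead of a sector.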

The following pie-chart examines industrial research. Almost 70% of it comes from the Indian headquarters of non-Indian companies. Google and IBM published almost 62% of all industry research publications, while only one Indian company appears in the top 10: TCS, with 13% of all publications.

For a country with more than 129 deemed universities, 67 public institutions of national importance, 700 degree-granting institutions, and 35,539 affiliated colleges (public and private), the chart below raises questions about the quality of education. The top 15 universities account for almost 42% of all research publications. IISc Bangalore alone accounts for a substantial 7.5% of publications. Even IIT Kharagpur, known as the research hub of the Indian IT sector, ranks 7th with 2.86% of publications.

Artificial Intelligence, Machine Learning, and Deep Learning are currently booming, and the bar graph below shows the popularity of research in these areas in India from 2001 to 2016. Publications in AI show a “zig-zag” pattern, likely due to the complexity of the research and a lack of financial support in India.
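One way to build per-year counts for such a bar graph is to match topic phrases in paper titles. This is only a sketch under stated assumptions: the write-up does not describe how papers were assigned to topics, so the `TOPICS` phrases, the function `yearly_topic_counts`, and the sample titles are all hypothetical.

```python
from collections import defaultdict

# Assumed topic phrases; the original classification method is not described.
TOPICS = {
    "AI": "artificial intelligence",
    "ML": "machine learning",
    "DL": "deep learning",
}


def yearly_topic_counts(papers):
    """papers: iterable of (year, title). Returns {topic: {year: count}}."""
    counts = {topic: defaultdict(int) for topic in TOPICS}
    for year, title in papers:
        lowered = title.lower()
        for topic, phrase in TOPICS.items():
            if phrase in lowered:
                counts[topic][year] += 1
    # Sort years so the result can be plotted left to right.
    return {topic: dict(sorted(by_year.items())) for topic, by_year in counts.items()}


# Hypothetical titles for illustration only.
sample_papers = [
    (2015, "Machine Learning for Crop Yield Prediction"),
    (2016, "Deep Learning Approaches to Speech Recognition"),
    (2016, "A Machine Learning View of Deep Learning"),
]
print(yearly_topic_counts(sample_papers))
```

A title that mentions two topics is counted once under each, which is a design choice of this sketch rather than something the source specifies.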

We compare this data to global trends. The disparity in growth rates is very apparent.

The top words used in the titles of arXiv papers are shown below.
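A word ranking of this kind can be produced with a simple frequency count over the titles. The sketch below is illustrative only: the stop-word list, the function `top_title_words`, and the sample titles are assumptions, not the notebook's actual preprocessing or arXiv data.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; the notebook's list is not given).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def top_title_words(titles, n=10):
    """Return the n most frequent non-stop-words across all titles."""
    words = []
    for title in titles:
        for word in re.findall(r"[a-z]+", title.lower()):
            if word not in STOP_WORDS:
                words.append(word)
    return Counter(words).most_common(n)


# Hypothetical titles for illustration only.
sample_titles = [
    "Deep Learning for Image Classification",
    "A Survey of Deep Reinforcement Learning",
    "Learning to Rank with Neural Networks",
]
print(top_title_words(sample_titles, n=2))  # [('learning', 3), ('deep', 2)]
```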

The top 20 Indian authors based on citation number are shown below. (Multiple persons indicate a tie.)
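Ranking with ties, as described above, can be done by sorting authors by citation count and grouping equal counts under one rank. This is a minimal sketch with hypothetical names and numbers; `rank_with_ties` is not a function from the original notebook.

```python
from itertools import groupby


def rank_with_ties(citations, top_n=20):
    """Group authors by citation count, highest first; tied authors share a rank."""
    # Sort by descending citations, then by name for a stable order within ties.
    ordered = sorted(citations.items(), key=lambda kv: (-kv[1], kv[0]))
    ranking = []
    grouped = groupby(ordered, key=lambda kv: kv[1])
    for rank, (count, group) in enumerate(grouped, start=1):
        if rank > top_n:
            break
        ranking.append((rank, count, [name for name, _ in group]))
    return ranking


# Hypothetical authors and citation counts for illustration only.
sample_citations = {"Author A": 120, "Author B": 95, "Author C": 120, "Author D": 40}
for rank, count, names in rank_with_ties(sample_citations, top_n=2):
    print(rank, count, ", ".join(names))
```

Here `top_n` limits the number of ranks, not the number of people, so a tie can make the list longer than 20 names.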

Top researchers from around the globe:

The graphs below show Indian research trends for each year from 2010 to 2016:








Total number of publications by the top 5 countries and India from 2001 to 2016:

About the datasets

The Jupyter notebook is available under the MIT open source license. If you want to use this code, feel free to do so and cite me.

