About Me

I’m Sanjaya Subedi, a software developer from Nepal. I completed my Bachelor of Science in Computer Science and Information Technology at Tribhuvan University, Nepal. My major was Data Mining and Data Warehousing. Currently I’m studying for a Master’s in Distributed Software Systems at Technische Universität Darmstadt, Germany. I am very interested in AI, data mining, natural language processing, and domain-specific language design and implementation. My professional career started at Yomari Private Limited, Nepal, where I had the opportunity to work in retail data warehousing and business intelligence. Since 2013, I have been a freelancer, and it has been the most exciting and challenging part of my professional life. I have had the opportunity to meet people from different parts of the world and help with their unique problems, from building recommender systems to designing and implementing a JavaScript-like scripting language.


The following are projects that I have worked on:

  • Nepali Text generation using Deep Neural Networks
  • Research and development of a stemming algorithm for the Nepali language that removes suffixes from words. (Download PDF report: Text Stemming in Nepali)
  • Development of a Nepali text processing library that includes stemming, tokenization, character length count, and stopword removal features. Available as a Java library or a RapidMiner extension.
  • Research and implementation of a Vector Space Model based CAPTCHA-solving algorithm capable of breaking the visual CAPTCHAs of more than four web platforms (guestbooks and forums) with more than 95% accuracy.
  • A recommender system based on hierarchical clustering and frequent pattern mining that recommends to new retailers the marketing strategies adopted by similar retailers in the same area.
  • A neural network system combined with a genetic algorithm to predict the market direction of the NASDAQ.
  • Design and implementation of a simple yet powerful web automation scripting language called ScrapeLang using ANTLR and C#.
  • An editor with syntax highlighting and script execution called ScrapeLang Editor. It was built using C#, AvalonEdit, and the Prism framework. The editor is modular and extensible, supports plugins, and follows the MVVM pattern. The user interface was built using WPF.
  • Design and implementation of a JavaScript-like scripting language using ANTLR and ActionScript.
  • Analysis of data management at the All Nepal Football Association (ANFA) and prediction of English Premier League (EPL) football matches using Logistic Regression and a Hidden Markov Model. (Group project. Download PDF report: ANFA Database and Prediction System)
  • Numerous web bots and crawlers using Scrapy, python-requests, and Selenium.

Programming Languages and Skills

I am most skilled in C# and Python. I have many years of experience building modular and extensible applications using C#, WPF, and the Prism library. I use Python for building web bots and scrapers with Scrapy and Selenium, image processing applications with PIL, and language processing applications with NLTK.

Besides those two languages, which I use frequently, I am also familiar with Java, R, JavaScript, ActionScript, and PHP, and have done a few projects using them.

I have a strong background in database systems and SQL. I have experience with Oracle Database 11g, SQL Server 2008 R2, MySQL, SQLite, and MS Access. I also have experience with SSIS and Oracle Data Integrator for building data loading packages.

For machine learning and data mining, my favorite tool and language is Python. I have also used R, RapidMiner, and Weka for my personal projects. R is quickly grabbing my attention, and it might become my favorite language for data analysis and data mining.