Broadly speaking, my research interests fall under four umbrellas: financial text analytics, intelligent finance models, knowledge discovery and engineering, and green finance. Under each umbrella there are several projects, some of which are, in fact, led by my students and collaborators.

Financial text analytics

Text makes up the majority of alternative data. Analyzing the diverse financial text that comes from different sources, including corporate disclosures, annual reports, earnings calls, financial news, and social media, however, presents challenges. This thread of research designs NLP methods and investigates their adaptation to the finance domain.

The following research questions are to be addressed:

  • How can the accuracy, informativeness, and explainability of TABFSA (targeted aspect-based financial sentiment analysis) be improved?
  • What linguistic cues and language styles are poorly captured by current machine learning and lexicon-based NLP methods?
  • How can noise and fake news in financial markets be identified?


Intelligent finance models

This stream of research is the dual of financial text analytics: we live in a world where financial services and markets are digitalizing, providing an unprecedented amount of information to support decision-making. Traditional models in finance, developed decades ago, have overlooked such unstructured information. By leveraging the methods developed to monitor public moods and discussions, this new information can be integrated into financial forecasting and investing models to help improve them. Ongoing projects in this stream address the following questions:

  • How can the risk aversion coefficient be estimated from text to better support customized portfolio management?
  • How can different economic periods be identified from public and private communications?


Knowledge discovery and engineering

Understanding financial texts requires dealing with many numbers, time expressions, and contextualized arguments, many of which cannot be interpreted without external references. Knowledge bases are often by-products of data analysis, but they are useful, even fundamental, for future work if well organized. Today, most NLP research benefits, directly or indirectly, from early efforts such as WordNet and DBpedia. In that sense, knowledge is both the purpose and the instrument (cf. Sukhomlinskii).

Completed studies and IT artifacts under this umbrella include the business classification system built for the Chinese National Equities Exchange and Quotations (NEEQ) market, and a domain-specific financial sentiment lexicon.


Green finance

Certain economic activities and developments have negative externalities: a typical example is environmental pollution and, once it becomes a global problem, climate change. The way we use technology makes a huge difference. The first wave of IT, for instance office automation, had many positive environmental impacts. But with the recent prevalence of AI, we should be cautious about inefficient or even destructive computing. Sometimes, unregulated use can turn a powerful invention into an existential threat. Here are some questions of concern:

  • Can text mining be used to discover and support greener and more innovative ideas?
  • What kind of IT development and infrastructure would promote ESG investing and financing?