BiverWordle is a visualization system for analyzing, visualizing, and validating sentiment about stock trends in financial text forums. It combines a candlestick-inspired time-series view of financial data with a stacked area chart (streamgraph) derived from text classification results, and adds word clouds arranged in the shapes of sad, neutral, and happy smiley faces. Together, these elements give an intuitive picture of stock market sentiment.
The name "BiverWordle" alludes to two river-like area charts flowing with words. We evaluate the system with text classification metrics (accuracy, precision, recall, and F1-score) on test data and with three use cases that demonstrate interactive data exploration.
The paper has been accepted at VINCI 2023 as a short paper!
- Integration of multiple visualization techniques for a comprehensive understanding of stock market sentiment.
- Analysis of Chinese financial text data to understand and visualize market trends.
- Validation of our system using three different stock trend scenarios.
- Exploration of potential improvements in both the financial text classification and visualization aspects of the project.
- Text Classification: BiverWordle employs text classification algorithms, such as textCNN and BERT, to analyze and categorize financial text data from forums.
- Visualization: BiverWordle combines various visualization techniques, including K-Line diagrams, Theme River, and Word Cloud, to represent stock trends and sentiments.
- Data Source: The data comes from 互动易, provided by Prof. LI Yuelei, Tianjin University. We identified eight categories within the data:

Index | Category | Index | Category |
---|---|---|---|
0 | 宏观政策 (Macroeconomic Policy) | 4 | 产品及服务 (Products and Services) |
1 | 业务动态 (Business Dynamics) | 5 | 股票情况及政策 (Stock Status and Policy) |
2 | 收益能力 (Earning Ability) | 6 | 股票价格波动 (Stock Price and Volatility) |
3 | 财务情况 (Financial Situation) | 7 | 其他 (Others) |

- Checkpoints: The checkpoints are available in Google Drive.
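As a rough illustration of how classified posts could feed the stacked area chart, the sketch below maps the eight category indices listed above to their English names and counts posts per day and category. The function name `daily_category_counts` and the `(date, category_index)` input shape are assumptions made for this example, not the project's actual data pipeline.

```python
from collections import Counter, defaultdict

# Category indices as listed above (English names only).
CATEGORIES = {
    0: "Macroeconomic Policy",
    1: "Business Dynamics",
    2: "Earning Ability",
    3: "Financial Situation",
    4: "Products and Services",
    5: "Stock Status and Policy",
    6: "Stock Price and Volatility",
    7: "Others",
}


def daily_category_counts(posts):
    """Count posts per day and category.

    `posts` is an iterable of (date_string, category_index) pairs; this
    input shape is hypothetical and chosen only for illustration.
    """
    counts = defaultdict(Counter)
    for day, index in posts:
        counts[day][CATEGORIES[index]] += 1
    # Fill absent categories with 0 so every day has the same set of layers,
    # which is what a stacked area chart expects.
    return {
        day: {name: counts[day][name] for name in CATEGORIES.values()}
        for day in sorted(counts)
    }
```

Each inner dictionary then corresponds to one vertical slice of the stacked area chart, with one layer per category.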
The dataset is partitioned into 11,915 texts for training, 3,910 for testing, and 3,910 for validation, adhering to a 6:2:2 ratio. We use three models: textCNN, BERT, and Voting. Here are the experiment results for these models:
Model | Category | Precision | Recall | F1-score |
---|---|---|---|---|
textCNN | 宏观政策 (Macroeconomic Policy) | 0.614 | 0.433 | 0.508 |
textCNN | 业务动态 (Business Dynamics) | 0.598 | 0.671 | 0.632 |
textCNN | 收益能力 (Earning Ability) | 0.542 | 0.522 | 0.532 |
textCNN | 财务情况 (Financial Situation) | 0.748 | 0.593 | 0.661 |
textCNN | 产品及服务 (Products and Services) | 0.651 | 0.736 | 0.691 |
textCNN | 股票情况及政策 (Stock Status and Policy) | 0.676 | 0.686 | 0.681 |
textCNN | 股票价格波动 (Stock Price and Volatility) | 0.611 | 0.627 | 0.619 |
textCNN | 其他 (Others) | 0.556 | 0.504 | 0.529 |
BERT | 宏观政策 (Macroeconomic Policy) | 0.526 | 0.514 | 0.520 |
BERT | 业务动态 (Business Dynamics) | 0.609 | 0.650 | 0.629 |
BERT | 收益能力 (Earning Ability) | 0.459 | 0.659 | 0.541 |
BERT | 财务情况 (Financial Situation) | 0.669 | 0.605 | 0.636 |
BERT | 产品及服务 (Products and Services) | 0.620 | 0.712 | 0.662 |
BERT | 股票情况及政策 (Stock Status and Policy) | 0.761 | 0.626 | 0.687 |
BERT | 股票价格波动 (Stock Price and Volatility) | 0.578 | 0.650 | 0.612 |
BERT | 其他 (Others) | 0.611 | 0.410 | 0.491 |
Voting | 宏观政策 (Macroeconomic Policy) | 0.607 | 0.491 | 0.543 |
Voting | 业务动态 (Business Dynamics) | 0.620 | 0.697 | 0.656 |
Voting | 收益能力 (Earning Ability) | 0.530 | 0.622 | 0.572 |
Voting | 财务情况 (Financial Situation) | 0.717 | 0.622 | 0.666 |
Voting | 产品及服务 (Products and Services) | 0.649 | 0.758 | 0.699 |
Voting | 股票情况及政策 (Stock Status and Policy) | 0.742 | 0.660 | 0.699 |
Voting | 股票价格波动 (Stock Price and Volatility) | 0.617 | 0.648 | 0.643 |
Voting | 其他 (Others) | 0.628 | 0.479 | 0.543 |
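For readers who want to reproduce per-category numbers like those above, here is a minimal sketch of how precision, recall, and F1-score are computed for a single category from true and predicted labels. The function name `per_class_metrics` is an assumption for this example; the project's own evaluation code may differ.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one category label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives

    # Guard against division by zero when a category is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

The same harmonic-mean formula links the columns of the table: a precision of 0.614 and a recall of 0.433 give an F1-score of about 0.508, as in the first textCNN row.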
The Accuracy of Each Model:
Model | Accuracy |
---|---|
textCNN | 0.624 |
BERT | 0.612 |
Voting | 0.642 |
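The description above does not say how the Voting model combines textCNN and BERT. One common scheme is soft voting, which averages the class probabilities of the two models and picks the highest-scoring category. The sketch below assumes that scheme; the function name `soft_vote` and the `weight_a` parameter are hypothetical and not taken from the project.

```python
def soft_vote(prob_a, prob_b, weight_a=0.5):
    """Combine two class-probability vectors and return the winning class index.

    Assumes soft voting (weighted averaging of probabilities); the project's
    actual voting rule may be different.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same number of classes")
    combined = [
        weight_a * a + (1 - weight_a) * b
        for a, b in zip(prob_a, prob_b)
    ]
    # Index of the largest combined probability.
    return max(range(len(combined)), key=combined.__getitem__)
```

With equal weights, a confident prediction from one model can override a weaker one from the other, which is one plausible reason an ensemble can edge out both of its members on accuracy.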
Incorporating G2 into BiverWordle allows you to create interactive and visually appealing charts to enhance your analysis and presentation of stock market sentiment. G2 is a powerful visualization library that provides a wide range of chart types and customization options.
To start using G2 in BiverWordle, follow these steps:
- Clone the BiverWordle repository from GitHub.
- Install the required dependencies by running `npm install` or `yarn install`.
- Set up the necessary configurations, such as API keys or database connections, as specified in the project documentation.
- Utilize the G2 library to create your desired charts and visualizations. Refer to the G2 documentation for more details and examples on how to utilize this powerful visualization library.
For more information about G2, visit the G2 GitHub repository.
To start using BiverWordle, follow these steps:
- Clone the repository:
git clone https://github.com/Brian-Lei-XIA/BiverWordle.git
- Navigate to the project directory:
cd BiverWordle
- Install the required dependencies:
pip install -r requirements.txt
- Navigate to the code directory:
cd ./code/FinancialVis
- Start the system using Djongo:
python manage.py runserver
Make sure you have Python and Djongo installed on your system before running the above commands. You can find more information about installing Djongo in the Djongo documentation.
Contributions to BiverWordle are welcome! If you would like to contribute to the project, please follow these steps:
- Fork the BiverWordle repository on GitHub.
- Create a new branch for your feature or bug fix.
- Make the necessary changes and commit your code.
- Push your changes to your forked repository.
- Submit a pull request to the main BiverWordle repository.
We plan to further enhance our system by:
- Employing specialist annotators to redefine tag categories and provide specialized data tagging.
- Exploring additional classification methods and experimenting with new sentiment analysis algorithms.
- Building a financial category-specific corpus for sentiment analysis.
- Integrating new visualization techniques and advanced features to improve the overall effectiveness and intuitiveness of our system.
We would like to thank the contributors and maintainers of the libraries and frameworks used in this project, as well as the financial text forums and datasets that provided valuable data for analysis.
For any inquiries or questions, please contact the BiverWordle team at brianleixia@connect.hku.hk.
Enjoy using BiverWordle and happy visualizing!
If you find our paper and tool interesting and useful, please feel free to give us a star and cite us:
@inproceedings{10.1145/3615522.3615541,
author = {Xia, Lei and Gao, Yi-Ping and Lin, Le and Chen, Yu-Xi and Zhang, Kang},
title = {BiverWordle: Visualizing Stock Market Sentiment with Financial Text Data and Trends},
booktitle = {Proceedings of the 16th International Symposium on Visual Information Communication and Interaction},
year = {2023},
url = {https://doi.org/10.1145/3615522.3615541},
doi = {10.1145/3615522.3615541},
}
BiverWordle is released under the MIT License.