Identifying data within financial statements with a machine-learning optical character reader

Authors
Babusiak, Ryland
Advisor
Ergin, Huseyin
Issue Date
2020-05
Keyword
Degree
Thesis (B.?)
Department
Honors College
Abstract

To meet the needs of my team’s clients for the CS498 capstone project, I needed to design software that could examine an image of a financial statement and produce a list of the assets shown. This functionality was well suited to being packaged as a library. The library provides a simple set of commands that wrap the far more complicated process of interacting with Amazon Web Services (AWS) Textract, a machine-learning character recognition service. In addition, the library provides a set of tools for working with the data that AWS Textract returns from its analysis. Included with the library is a simple web application server that mirrors the capstone project’s use case and serves as an example of how to use the library effectively.
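
As a rough illustration of the kind of wrapper the abstract describes, the sketch below sends an image to AWS Textract through boto3's `analyze_document` call and rebuilds table rows from the returned blocks. It is not the thesis library's actual code: the function names `analyze_statement`, `table_rows`, and `list_assets` are hypothetical, and the sketch assumes the statement contains a single table whose first column holds the asset name and whose second column holds its value.

```python
from collections import defaultdict


def analyze_statement(image_bytes):
    """Send an image to AWS Textract and return the raw response.

    Requires boto3 and configured AWS credentials.
    """
    import boto3  # imported lazily so the parsing helpers work without boto3

    client = boto3.client("textract")
    return client.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=["TABLES"],
    )


def table_rows(response):
    """Rebuild table rows (lists of cell strings) from CELL and WORD blocks.

    Assumption: the response holds one table, so row indices do not collide.
    """
    blocks = {block["Id"]: block for block in response["Blocks"]}
    rows = defaultdict(dict)
    for block in response["Blocks"]:
        if block["BlockType"] != "CELL":
            continue
        # A cell's words are referenced through its CHILD relationships.
        child_ids = [
            child_id
            for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for child_id in rel["Ids"]
        ]
        text = " ".join(
            blocks[child_id]["Text"]
            for child_id in child_ids
            if blocks[child_id]["BlockType"] == "WORD"
        )
        rows[block["RowIndex"]][block["ColumnIndex"]] = text
    # Order rows and columns by their Textract indices.
    return [[cols[c] for c in sorted(cols)] for _, cols in sorted(rows.items())]


def list_assets(response):
    """Hypothetical helper: pair column 1 (asset name) with column 2 (value)."""
    return [
        (row[0], row[1])
        for row in table_rows(response)
        if len(row) >= 2 and row[0]
    ]
```

A caller would read the image file in binary mode, pass the bytes to `analyze_statement`, and hand the response to `list_assets`. Keeping the parsing separate from the network call means the post-processing can be exercised on a saved response without contacting AWS.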