Inspiration

We were inspired by the difficulties of analyzing data on topics that did not have well established datasets. This process can be arduous and expensive, with little framework support. Thus, we developed AlphaMine.

What it does

AlphaMine is an open source framework that compiles clean and comprehensive for Machine Learning training and validation. Simply by entering the type of data and classes, datasets are generated. Currently, we have functionalities to generate datasets for images (Computer Vision) and text (Natural Language Processing).

How we built it

We used python and implemented a series of libraries including Selenium, Beautiful-Soup, PIL, and opencv.

Challenges we ran into

We faced issues with source control, which we resolved using established programming methods and requirements.txt and code reviews. We also faced issues with saving online images to our local directory, which we solved using error handling.

Accomplishments that we're proud of

We are proud of building an open source, accessible framework that facilitates easier development. By using AlphaMine, users can gain access to new datasets which were previously unfeasible.

What we learned

We learned how to develop robust frameworks that follow modular principles. Additionally, we learned how to work cohesively to optimize efficiency and balance our strengths and weaknesses.

What's next for AlphaMine

Given AlphaMine's scalability, we will expand by adding additional forms of media including video, audio, and time series. Additionally, we will add more preprocessing features for data cleanup. Eventually, we plan to deploy our framework on pip.

Share this project:

Updates