Inspiration
We were inspired by the difficulty of analyzing data on topics that lack well-established datasets. Building a dataset from scratch can be arduous and expensive, with little framework support. Thus, we developed AlphaMine.
What it does
AlphaMine is an open source framework that compiles clean and comprehensive datasets for Machine Learning training and validation. Users enter the type of data and the classes they want, and AlphaMine generates the dataset. Currently, it can generate datasets for images (Computer Vision) and text (Natural Language Processing).
How we built it
We used Python with a series of libraries including Selenium, Beautiful Soup, PIL, and OpenCV.
Challenges we ran into
We faced issues with source control, which we resolved by pinning dependencies in requirements.txt and holding code reviews. We also faced issues with saving online images to our local directory, which we solved by adding error handling around the downloads.
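As a rough illustration of the image-saving fix, here is a minimal sketch of downloading images with error handling so that one bad link does not stop the whole run. It is not AlphaMine's actual code: the function names `save_image` and `save_all`, the hash-based file naming, and the use of the standard library's `urllib` are all assumptions made for this example.

```python
import hashlib
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path


def save_image(url, dest_dir, timeout=10):
    """Download one image; return the saved path, or None on failure.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            data = response.read()
    except (urllib.error.URLError, ValueError, OSError) as exc:
        # Broken links, timeouts, and malformed URLs are skipped, not fatal.
        print(f"skipped {url}: {exc}")
        return None
    if not data:
        print(f"skipped {url}: empty response")
        return None
    # Name the file by content hash (an assumption of this sketch) so that
    # the same image fetched twice is stored only once.
    suffix = Path(urllib.parse.urlparse(url).path).suffix or ".img"
    target = dest / (hashlib.sha256(data).hexdigest()[:16] + suffix)
    target.write_bytes(data)
    return target


def save_all(urls, dest_dir):
    """Try every URL, keep going after failures, return the saved paths."""
    saved = [save_image(u, dest_dir) for u in urls]
    return [p for p in saved if p is not None]
```

Calling `save_all(list_of_urls, "dataset/cats")` would return only the paths that were written successfully, so later preprocessing steps never see a missing file.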
Accomplishments that we're proud of
We are proud of building an open source, accessible framework that makes Machine Learning development easier. With AlphaMine, users can build datasets that were previously infeasible to obtain.
What we learned
We learned how to develop robust frameworks that follow modular principles. Additionally, we learned how to work cohesively to optimize efficiency and balance our strengths and weaknesses.
What's next for AlphaMine
Given AlphaMine's scalability, we will expand it to support additional forms of media including video, audio, and time series. We will also add more preprocessing features for data cleanup. Eventually, we plan to publish the framework so that it can be installed with pip.