Team Farzad

Introduction

Hello there! We are a ragtag team of 5 people who have joined forces to solve challenge 3 by Julius Bär Group. We came together through a shared interest in cybersecurity to build an effective file scraper that can be used to identify sensitive data in files.

Inspiration

Cybersecurity is a very experience-focused discipline, and we are relative newcomers to the field. Even so, we hoped to make a dent with a solution that shows a clear road ahead to a full-fledged tool for Julius Bär. We aimed to build a minimum viable product that meets the expectations of a major organisation and helps it meet regulatory demands.

How we built it

Before coding our solution, we spent time researching what leads each file type to be classified as sensitive or not, so that our solution would fit the criteria Julius Bär is comfortable with.

The pipeline behind this project starts by identifying each file from its file signature (far more reliable than the file extension!). The next stage prepares the file for processing by extracting its text contents. Examples include using OCR to extract text from image files, using the BeautifulSoup library to parse HTML, and even using pydub to help turn mp3 audio files into text that can be examined! With the text available, we passed it to our pre-trained machine learning model, which had learnt to identify pertinent factors within text that suggest privileged information.
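To illustrate the first stage, here is a minimal sketch of identifying a file by its signature rather than its extension. It is not our actual code: the `SIGNATURES` table and the `identify_bytes` and `identify_file` helpers are hypothetical names made up for this example, and the table only covers a handful of well-known magic numbers.

```python
from typing import Optional

# Hypothetical signature table (magic bytes -> label), for illustration only.
# A real tool would cover many more formats.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"PK\x03\x04", "zip"),  # also the container for docx/xlsx/pptx
    (b"ID3", "mp3"),         # mp3 files that start with an ID3v2 tag
]


def identify_bytes(header: bytes) -> Optional[str]:
    """Return a label for the first matching signature, or None if unknown."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return None


def identify_file(path: str, header_size: int = 16) -> Optional[str]:
    """Read only the first few bytes of a file and identify it by signature."""
    with open(path, "rb") as fh:
        return identify_bytes(fh.read(header_size))
```

The point of the design is that a PNG renamed to `notes.txt` is still reported as `png`, because only the leading bytes are inspected. Files with no recognised signature return `None` and would need a fallback, for example treating them as plain text.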

Challenges we ran into

Our main challenges were compatibility issues, inaccessible libraries, Docker daemons that would not start and the final Herculean task of integrating all of our code.

Accomplishments that we're proud of

Over these (mostly) continuous 40 hours, we learned that a task can seem easy from a distance, but on closer inspection a healthy paranoia sets in, and with it an understanding of the many situations you have to prepare for when you want to publish code for an entropic world.

What we learned

We learnt the value of leaning on each other for help, sharing our opinions and staying motivated to produce work we would be proud of.

What's next for Farzad

At this stage, we see our code as a strong proof of concept for an idea that could be in genuine use at Julius Bär in the very near future.
