Inspiration

Can we convert text knowledge into a video or some sort of visual knowledge? This was the main idea behind "Graph Cinema".

Some people complain about the younger generations. They blame young people by saying "they don't read, they just spend time on social media". This is fairly true. People don't read because reading can be boring, while images and videos are much more engaging and exciting. That's why people tend to watch videos and look at images instead of reading a book. There is also a saying: "a picture is worth a thousand words". The human brain evolved around visual receptors, so visual knowledge is a lot more engaging and fun!

For a few years, I have had a special interest in network diagrams and graph visualisations. I noticed that graphs can be used as a medium to turn boring text into more visual forms. Graphs are great at emphasizing connections and relations. Even drawing an arrow between two text segments helps the human brain absorb knowledge more efficiently.

That's why I thought: let's not blame the younger generations, and let's make books more fun instead. I wanted to create a tool that simply converts text knowledge into visual knowledge. In this way, knowledge could be more engaging, more fun, and more visual.

What it does

It converts natural language text into excalidraw drawings. First, it converts the text into mermaid.js syntax using the npm library I built: text-to-mermaid. This library is the engine and the key part of the project. It has 3 ways of converting text: a deterministic algorithm, the Gemini API, or a local Large Language Model API. The generated mermaid.js syntax is then converted to excalidraw elements using the @excalidraw/mermaid-to-excalidraw library, and the visualization is drawn with the excalidraw library.

The text can be converted sentence-by-sentence or as a whole. When the text is converted sentence-by-sentence, each sentence is shown as a scene. When the whole text is converted, only one scene is created. When there are multiple scenes, the user has the option to play the scenes one-by-one like a video.

Of course, the user can still use all the amazing features of the excalidraw library: draw more arrows, change the colors, change the background color, add a link, add an image, export the whiteboard as an image ...
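
The two conversion modes can be sketched as follows. This is only an illustration: `splitSentences`, `sentencesToMermaid`, and `textToScenes` are hypothetical names, and the naive "first word, second word, rest" rule is an assumed stand-in for a deterministic algorithm, not the actual text-to-mermaid implementation or its API.

```typescript
// Hypothetical sketch: NOT the text-to-mermaid API.
type Mode = "sentence" | "whole";

// Split text after sentence-ending punctuation followed by whitespace.
function splitSentences(text: string): string[] {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Drop characters that would break a quoted mermaid label or an edge label.
function clean(label: string): string {
  return label.replace(/["|]/g, "");
}

// Assumed naive rule: "A verb B ..." becomes  A -->|verb| "B ...".
function sentencesToMermaid(sentences: string[]): string {
  const lines = ["graph LR"];
  sentences.forEach((sentence, i) => {
    const words = sentence
      .replace(/[.!?]+$/, "")
      .split(/\s+/)
      .filter(Boolean);
    if (words.length === 0) return;
    if (words.length < 3) {
      // Too short for the rule: emit a single node.
      lines.push(`  s${i}a["${clean(words.join(" "))}"]`);
      return;
    }
    const [subject, verb, ...rest] = words;
    lines.push(
      `  s${i}a["${clean(subject)}"] -->|${clean(verb)}| s${i}b["${clean(rest.join(" "))}"]`
    );
  });
  return lines.join("\n");
}

// One mermaid diagram per scene: one scene per sentence, or a single scene.
function textToScenes(text: string, mode: Mode): string[] {
  const sentences = splitSentences(text);
  return mode === "sentence"
    ? sentences.map((s) => sentencesToMermaid([s]))
    : [sentencesToMermaid(sentences)];
}
```

Each returned string is one scene in mermaid syntax, which could then be handed to the mermaid-to-excalidraw step. Playing the scenes "like a video" amounts to showing the array entries one after another.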

How we built it

Antigravity was used as the IDE to develop both the text-to-mermaid library and the "Graph Cinema" app. Both use vite for bundling and vitest for unit tests. "Graph Cinema" uses react to render the user interface.

Challenges we ran into

  • First of all, finding a suitable library for drawing was hard. I noticed that Excalidraw is a capable, open-source whiteboard library.
  • After finding the drawing library, parsing text was another challenge. At first, I thought about converting text directly into the JSON syntax of excalidraw. Later I discussed this issue with Gemini, and it suggested using the mermaid-to-excalidraw library. So I first needed to convert text to mermaid syntax, and then I used the mermaid-to-excalidraw library.
  • After converting natural language text to an excalidraw diagram, I wanted to use images in the diagram. I wanted to convert text to images automatically and replace the text and shape with an image, but I couldn't really find a way. Instead, I used a simple emojify algorithm to convert text to emojis.
  • Persisting the manually updated diagrams and clearing the cache caused some bugs.
  • Many issues stemmed from the conversion libraries. I solved these problems with post-processing:
    • Width estimation was wrong for certain shapes.
    • The top-most element had to be moved in some cases.
    • The points of linear Excalidraw elements had to be normalized to ensure correct positioning.
    • Some text became multi-line unnecessarily.
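
As an illustration of the point-normalization step in the list above, here is a minimal sketch. It assumes that a linear element (arrow or line) stores its `points` relative to its own `x`/`y` and that the first point is expected to be `[0, 0]`. The `LinearElement` interface and `normalizeLinearPoints` are hypothetical names covering only the fields used here, not the full Excalidraw element type or the project's actual post-processing code.

```typescript
// Hypothetical, reduced view of a linear element (arrow/line).
interface LinearElement {
  x: number;
  y: number;
  width: number;
  height: number;
  points: [number, number][];
}

// Shift the points so the first one is [0, 0], move the element origin by the
// same offset so the drawn shape stays in place, and recompute the size.
function normalizeLinearPoints(el: LinearElement): LinearElement {
  if (el.points.length === 0) return el;
  const [dx, dy] = el.points[0];
  const points = el.points.map(
    ([px, py]) => [px - dx, py - dy] as [number, number]
  );
  const xs = points.map((p) => p[0]);
  const ys = points.map((p) => p[1]);
  return {
    ...el,
    x: el.x + dx,
    y: el.y + dy,
    points,
    width: Math.max(...xs) - Math.min(...xs),
    height: Math.max(...ys) - Math.min(...ys),
  };
}
```

The absolute position of every point (`x + point[0]`, `y + point[1]`) is the same before and after the call; only the bookkeeping changes.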

Accomplishments that we're proud of

  • Convert any text to whiteboard diagrams with deterministic algorithms. The algorithm is not complex or comprehensive, but it is still a good start.
  • Convert any text to whiteboard diagrams with Gemini. Gemini creates more complex structures.
  • Convert any text to whiteboard diagrams with a locally hosted LLM.
  • Do the text conversion sentence-by-sentence or as a whole.
  • Use Excalidraw to further customise the visualisation.

What we learned

  • Dividing complex jobs into smaller steps is the way to go.
  • Google AIStudio and Antigravity help a lot with implementing features and fixing bugs. Short, well-described prompts give good results.

What's next for Graph Cinema

  • Currently, emojis are the only images added to the whiteboard automatically. Generating images and replacing certain text with them could improve the quality. Of course, the conveyed knowledge should not change; it should only improve. The generated diagram must always be a one-to-one conversion of the text.
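
For context, the current emoji step can be pictured as a simple dictionary lookup. The sketch below is an assumption about how such an emojify pass might work: `EMOJI_MAP` and `emojify` are hypothetical names, the word list is made up, and appending the emoji (rather than replacing the word) is a choice made here to keep the text one-to-one.

```typescript
// Hypothetical word-to-emoji dictionary; the real list is not shown in this write-up.
const EMOJI_MAP = new Map<string, string>([
  ["book", "📚"],
  ["video", "🎬"],
  ["brain", "🧠"],
]);

// Append an emoji after every known word, leaving all other text untouched.
function emojify(text: string): string {
  return text.replace(/[A-Za-z]+/g, (word) => {
    const emoji = EMOJI_MAP.get(word.toLowerCase());
    return emoji ? `${word} ${emoji}` : word;
  });
}
```

Because the original words are kept, the diagram label still carries the full text, and the emoji only adds a visual cue.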
