This blog is dedicated to exploring the behavior of large language models and generative AI through data analysis and experimentation.
Language Models
Large language models (LLMs) are increasingly integrated into our daily lives, from social media to search engines. However, they are still largely black boxes: researchers and industry experts have only scratched the surface of understanding how these models actually work. It is important to probe these models to learn more about how they behave under different circumstances and scenarios.
Data Stories
Telling compelling stories using multiple sources of data is a challenging but rewarding feat. Each dataset on its own may tell only a limited part of the story, but combining datasets often leads to interesting results and conclusions.
The Forbidden Pages: A Data Analysis of Book Bans in the US
How I Cried in 2022: An Analysis of 365 Days of Personal Data
Applied Projects
Large language models such as ChatGPT are getting increasingly good at generating code. Sometimes they are faster (if not better) than I am at creating projects.
How I transformed every news headline into a satire for April Fools
art(fish) simulation: using GPT-4 to recreate my first ever computer science project
About Me
I’m Yennie, a machine learning engineer and AI researcher. I currently work at a healthcare startup, where I train large language models on data such as electronic health records. I have previously worked as an AI researcher, data scientist, and software engineer with the University of Oxford, DeepLearning.AI, OpenAI, the United Nations, and Microsoft.
