In the summer of 2024 I was a Team Captain with Cohere Labs' Expedition Aya, an open-research effort to improve multilingual AI. I co-led a team of five other researchers from around the world in developing a system to automatically detect mislabelled data in the Aya dataset. You can learn more about the Aya project here, and see our presentation slides here. At a glance, we...
- developed a system to detect mislabelled data in the world's largest multilingual dataset
- got to contribute to the development of the state-of-the-art multilingual LLM, Aya
- scheduled and coordinated meetings across six time zones (...harder than you'd expect)