Aman, I watched the video you did with Peter. It was very helpful in confirming what I had suspected about the work involved in evaluating the quality of an LLM's outputs. Setting up evals takes time for teams that aren't organised or data-driven minded: defining the rubric and evaluation criteria (plus the debates), then analysing and scoring the outputs (plus more debates). Given that, how much time would you say it takes such a team to get a first version of their AI agent ready for an internal, closed-user-group release? I'm asking because my observation is that few organisations are rigorous enough for a subject matter expert to define quality criteria at a granular enough level to say "This is good/bad/average." I sense the setup work alone is going to take some time.
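To make concrete what I mean by the setup work, here is a minimal sketch of the kind of rubric I have in mind: a handful of SME-agreed binary criteria plus a scoring loop over sampled outputs. Everything here is a hypothetical placeholder (the criterion names, the `judge` callable, the demo output); in practice each criterion would come out of the debates I mentioned, and the judge would be a human labeller or an LLM-as-judge call.

```python
# Minimal sketch of a rubric + scoring loop (all names are hypothetical).
from dataclasses import dataclass
from typing import Callable

@dataclass
class Criterion:
    name: str       # short label, e.g. "no_fabrication"
    question: str   # the yes/no question the SME signed off on

# Hypothetical rubric for an internal support agent.
RUBRIC = [
    Criterion("answers_question", "Does the reply address the user's actual question?"),
    Criterion("no_fabrication",   "Is every factual claim supported by the provided context?"),
    Criterion("correct_tone",     "Is the tone consistent with the internal style guide?"),
]

def score_output(output: str, judge: Callable[[str, Criterion], bool]) -> dict:
    """Apply each criterion via a judge (human labeller or LLM-as-judge)."""
    results = {c.name: judge(output, c) for c in RUBRIC}
    results["pass"] = all(results.values())  # overall verdict is binary, not an average
    return results

if __name__ == "__main__":
    # Stand-in judge so the sketch runs; replace with SME labels or a model call.
    demo_judge = lambda output, criterion: len(output) > 0
    print(score_output("The refund policy allows returns within 30 days.", demo_judge))
```

In my experience, binary pass/fail per criterion shortens those debates compared with a 1-5 quality scale, since the SME only has to defend a yes/no line rather than the difference between a 3 and a 4.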