You should use tidylog in your #rstats corpus phonetics workflow

Last week, I asked #rstats Twitter for a bit of help with something that has always felt clunky in my R code but was never annoying enough to actually fix. In corpus phonetics, you typically start with a large data set, make measurements, and then use informed criteria to filter out errors to the best of your ability, because measurements can be wrong. When you go to share your findings, you need to report how many items were removed (and why). To do this, you have to keep track. Sure, alternating between filter() and print(nrow(df)) works, but it’s clunky. I’m starting to think that maybe I should have been annoyed earlier. ...

29 Mar 2021 · 5 min · Khia A. Johnson

Simulating data: Take 1

I’m working on a project with a somewhat tricky analysis, and I hit a wall. I didn’t want to just figure the analysis out as I went, for fear of baking researcher bias into the results, but I also wasn’t sure whether the half-baked analysis would actually answer the question. Enter data simulation, a.k.a. something I should have done a long time ago. ...

10 Feb 2021 · 4 min · Khia A. Johnson