Herber Project Review

Author

Riley Herber

Published

May 1, 2026

1 Overview

Title of project:

Identifying Key Metadata Predictors of Salmonella AMR Genotypes through Machine Learning

Name of project author(s):

  • Marco Reina

Name of project reviewer:

  • Riley Herber

2 Instructions

Write your comments and feedback below for each section/component of the project. The goal should be to help the author improve their project. Make comments as constructive and actionable as possible. You can provide both criticism and praise.

For each component, pick one summary statement by deleting the ones that do not apply and keeping only the one that you think most closely summarizes a given component.

Make sure your final document compiles/renders into a readable, well-formatted html document.

Delete any sections/text of this template that are not part of your final review document. (Including these instructions.)

3 Specific project content evaluation

Evaluate the different parts of the project by filling in the sections below.

3.1 Background, Context and Motivation

How well is the context of the project described? Is a comprehensive background, including summary of previous/related work given? Is the project well placed into the context of existing work (including proper referencing of existing work). Is it clear why the project was undertaken and what new information it hopes to provide?

3.1.1 Feedback and Comments

The project provides a strong and well-developed background on antimicrobial resistance (AMR), particularly in Salmonella. You effectively connect global public health concerns (e.g., AMR burden projections) with the specific motivation for using metadata-driven approaches. The discussion of surveillance systems like NARMS and NCBI Pathogen Detection is especially helpful in grounding the project in real-world data infrastructure.

3.1.2 Summary assessment

  • strong contextualization and motivation

3.2 Question description

How well and clear are the question(s)/hypotheses the project aims to address described? Is it clear how the questions relate to the data?

3.2.1 Feedback and Comments

I understand you are using metadata predictors to predict AMR status and the “feasibility of metadata-driven risk stratification,” but what does feasibility mean here? Does it mean the model must perform to a certain standard? Why is it important to be able to use metadata alone?

3.2.2 Summary assessment

  • question/hypotheses somewhat explained

3.3 Data description

How well is the data overall described? Is the source provided? Is a codebook or other meta-information available that makes it clear what the data is?

3.3.1 Feedback and Comments

I really appreciated the bulleted list of the key variables from each dataset. This makes it very easy for the reader to visualize how the data looks and what kind of information is found within each dataset. The data description is detailed and informative, particularly in explaining the data sources (NCBI Pathogen Detection, NARMS), dataset size and filtering steps, and key variables and outcome definition.

3.3.2 Summary assessment

  • source and overall structure of data well explained

3.4 Data wrangling and exploratory analysis

How well is the data cleaned/processed and explored? Are all steps reasonable and well explained? Are alternatives discussed and considered? Are meaningful exploratory results shown (e.g. in the supplementary materials)?

3.4.1 Feedback and Comments

The data wrangling process is thorough and well thought out. You explain each step of the data cleaning process without getting too technical.

I like how you organized your EDA section because this is an area I often find difficult to summarize. I like that you linked your full EDA analysis but chose only a few figures for the manuscript, describing each one and what was learned from it.

3.4.2 Summary assessment

  • essentially no weaknesses in wrangling and exploratory component

3.5 Appropriateness of Analysis

Were the analysis methods appropriate for the data? Was the analysis done properly? Were different components of the analysis (e.g. performance measure, variable selection, data pre-processing, model evaluation) done in the best way possible and explained well?

3.5.1 Feedback and Comments

You do a great job of walking your reader through your analysis and explaining each step and figure along the way. You explain why a random forest classifier was selected. You also break down the most important predictors within this RF model and how it performed. These inclusions make it very easy for the reader to understand the line of logic for your study.

3.5.2 Summary assessment

  • strong and reasonable analysis

3.6 Presentation

How well are results presented? Are tables and figures easy to read and understand? Are the main figures/tables publication level quality?

3.6.1 Feedback and Comments

Your figures were very well presented and professionally done. All of them had descriptive names and were linked in the paragraph where they are discussed, which makes the interpretation of each figure very easy to follow.

3.6.2 Summary assessment

  • results are very well presented

3.7 Discussion/Conclusions

Are the study findings properly discussed? Are strengths and limitations acknowledged? Are findings interpreted properly?

3.7.1 Feedback and Comments

The discussion is very strong. It effectively contextualizes your main results, interprets model performance in a public health context, and explains the trade-off between sensitivity and specificity.

The strengths and limitations are also well thought out.

3.7.2 Summary assessment

  • strong, complete and clear discussion

3.8 Further comments

I really liked how you made your manuscript a navigable website, where you can click through to each step of the pipeline. I found this very intuitive, and it gave me easy access to every step of your workflow while reading your manuscript whenever I wanted more information on the process. This is something I am interested in doing for my own future projects.

4 Overall project content evaluation

Evaluate overall features of the project by filling in the sections below.

4.1 Structure

Is the project well structured? Are files in well labeled folders? Do files have reasonable names? Are all “junk” files not needed for analysis/reproduction removed? By just looking at files and folders, can you get an idea of how things fit together?

4.1.1 Feedback and Comments

Though you had a rendered webpage linked in your GitHub, I still ran each part of the workflow on my own to make sure everything worked. Everything pathed to the correct location, and having your scripts numbered in order made the workflow very easy to follow.

4.1.2 Summary assessment

  • well structured

4.2 Documentation

How well is the project documented? Are you able to understand each step of the whole analysis, each decision that was made, and each line of code? Is enough information provided as comments in code or as part of Rmd files?

4.2.1 Feedback and Comments

Your code doesn’t have many comments, and the supplementary scripts don’t include much information on what each one is doing. Your readme files also have minimal detail about what is in each folder. The steps mentioned in your manuscript, though, are very easy to follow.

4.2.2 Summary assessment

  • fully and well documented

4.3 Reproducibility

Are all results fully reproducible? Is documentation provided which clearly explains how to reproduce things, and does it work without the need for any manual intervention? Are you able to re-run the whole analysis without having to do manual interventions/edits?

4.3.1 Feedback and Comments

I did not have any issues re-rendering your scripts and manuscript.

4.3.2 Summary assessment

  • fully reproducible without issues

4.4 Thoroughness

How thorough was the overall study? Were alternatives (e.g. different ways of processing the data or different models) considered? Were alternatives discussed? Were the questions/hypotheses fully and thoroughly addressed?

4.4.1 Feedback and Comments

This study was very well thought out, and there were no lines of logic or context that I felt were left unexplained.

4.4.2 Summary assessment

  • strong level of thoroughness

4.5 Further comments

I wish I had more detailed critiques to offer, but I genuinely found very few areas that needed improvement. This project feels like a practically finished product. Everything is clearly presented and well thought out. Great job!