Demo Data

Example datasets for testing MKrep analysis tools


Available Demo Datasets

Download these example datasets to test the analysis tools. The feature tables are CSV files keyed by Strain_ID, most with binary (0/1) values; the phylogenetic tree is provided in Newick format.

MIC.csv

Minimum Inhibitory Concentration (MIC) data for various antibiotics, encoded as binary susceptible/resistant calls

  • Rows: Bacterial strains (with Strain_ID)
  • Columns: Antibiotics (binary: 0 = susceptible, 1 = resistant)
  • Size: ~3 KB, 50+ strains, 15+ antibiotics
Download MIC.csv

AMR_genes.csv

Antimicrobial resistance genes presence/absence matrix

  • Rows: Bacterial strains (with Strain_ID)
  • Columns: AMR genes (binary: 0 = absent, 1 = present)
  • Size: ~5 KB, 50+ strains, 30+ genes
Download AMR_genes.csv

Virulence.csv

Virulence factors presence/absence data

  • Rows: Bacterial strains (with Strain_ID)
  • Columns: Virulence factors (binary: 0 = absent, 1 = present)
  • Size: ~20 KB, 50+ strains, 100+ factors
Download Virulence.csv

MLST.csv

Multi-Locus Sequence Typing data

  • Rows: Bacterial strains (with Strain_ID)
  • Columns: MLST alleles and sequence types
  • Size: ~2 KB, 50+ strains
Download MLST.csv

Serotype.csv

Serological typing information

  • Rows: Bacterial strains (with Strain_ID)
  • Columns: Serotype classifications
Download Serotype.csv

Plasmid.csv

Plasmid presence/absence profiles

  • Rows: Bacterial strains (with Strain_ID)
  • Columns: Plasmid types (binary: 0 = absent, 1 = present)
Download Plasmid.csv

MGE.csv

Mobile Genetic Elements data

  • Rows: Bacterial strains (with Strain_ID)
  • Columns: MGE types (binary: 0 = absent, 1 = present)
Download MGE.csv

Snp_tree.newick

Phylogenetic tree in Newick format (for phylogenetic analyses)

  • Format: Newick tree format
  • Tips: Same strains as in CSV files
Download Snp_tree.newick
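
Because the tree tips are expected to match the strains in the CSV files, it can be useful to compare the two before running a phylogenetic analysis. The following is a minimal sketch, not part of MKrep: the function names are hypothetical, the tree string and strain list are made up for illustration, and the regular expression only handles unquoted tip labels without Newick comments.

```python
import re


def newick_tip_labels(newick):
    """Extract unquoted tip labels from a Newick string.

    Tips are the labels that directly follow "(" or ","; labels that
    follow ")" belong to internal nodes and are ignored.
    """
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", newick))


def compare_tips_to_strains(newick, strain_ids):
    """Return (strains missing from the tree, tips missing from the table)."""
    tips = newick_tip_labels(newick)
    strains = set(strain_ids)
    return sorted(strains - tips), sorted(tips - strains)


# Illustrative tree and strain list (not taken from the demo files)
tree = "((Strain001:0.1,Strain002:0.2):0.05,Strain003:0.3);"
missing_in_tree, missing_in_table = compare_tips_to_strains(
    tree, ["Strain001", "Strain002", "Strain004"]
)
print("Strains not in tree:", missing_in_tree)
print("Tips not in table:", missing_in_table)
```

For trees with quoted labels or embedded comments, a dedicated Newick parser would be a safer choice than this regular expression.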

Complete Dataset Bundle

Download all files at once from the GitHub repository: https://github.com/MK-vet/MKrep

The repository contains all source code, documentation, and example data.

Data Format Requirements

Important Format Guidelines
  • First Column: Must be named "Strain_ID" (case-sensitive)
  • Binary Values: Use only 0 (absence) and 1 (presence)
  • No Missing Values: All cells must contain 0 or 1
  • CSV Format: Comma-separated values
  • Unique IDs: Each Strain_ID must be unique

Example Format:

Strain_ID,Gene1,Gene2,Gene3
Strain001,1,0,1
Strain002,0,1,1
Strain003,1,1,0
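
The guidelines above can be checked before uploading a file. The sketch below uses only the Python standard library and is an illustration rather than MKrep's own validation; the function name validate_binary_csv is hypothetical.

```python
import csv
import io


def validate_binary_csv(text):
    """Return a list of problems found in a binary feature table (empty if none)."""
    problems = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header, body = rows[0], rows[1:]
    if not header or header[0] != "Strain_ID":  # case-sensitive comparison
        problems.append('first column must be named "Strain_ID"')
    seen = set()
    for line_no, row in enumerate(body, start=2):
        if not row:
            continue  # skip blank lines
        strain = row[0]
        if strain in seen:
            problems.append(f"line {line_no}: duplicate Strain_ID {strain!r}")
        seen.add(strain)
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} cells, found {len(row)}"
            )
        # Every feature cell must be exactly 0 or 1 (empty cells are rejected too)
        for col, value in zip(header[1:], row[1:]):
            if value.strip() not in ("0", "1"):
                problems.append(
                    f"line {line_no}, column {col!r}: value {value!r} is not 0 or 1"
                )
    return problems


example = """Strain_ID,Gene1,Gene2,Gene3
Strain001,1,0,1
Strain002,0,1,1
Strain003,1,1,0
"""
print(validate_binary_csv(example))  # an empty list means no problems were found
```

To check a downloaded file, read its contents first, for example with open("AMR_genes.csv", newline="") and .read(), and pass the resulting string to the function.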

Using Demo Data

In Google Colab:

  1. Open a Colab notebook from the main page
  2. When prompted, upload the demo CSV files you downloaded
  3. Run the analysis
  4. Download results

Local Installation:

  1. Clone the repository: git clone https://github.com/MK-vet/MKrep.git
  2. Demo files are already included in the repository
  3. Run any analysis script directly
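
Since every demo table is keyed by Strain_ID, tables obtained either way can be combined for exploratory checks. The following is a minimal sketch under stated assumptions: read_table and merge_tables are hypothetical helpers, not MKrep functions, the inline data is invented for illustration, and at least one table must be supplied.

```python
import csv
import io


def read_table(handle):
    """Read one CSV into {strain_id: {column: value}}."""
    return {
        row["Strain_ID"]: {k: v for k, v in row.items() if k != "Strain_ID"}
        for row in csv.DictReader(handle)
    }


def merge_tables(tables):
    """Inner-join named tables on Strain_ID, prefixing columns with the table name."""
    shared = set.intersection(*(set(t) for t in tables.values()))
    return {
        strain: {
            f"{name}:{col}": val
            for name, table in tables.items()
            for col, val in table[strain].items()
        }
        for strain in sorted(shared)
    }


# Invented stand-ins for two demo tables; with real files, pass
# open("MIC.csv", newline="") and similar handles instead.
mic = io.StringIO("Strain_ID,DrugA\nS1,1\nS2,0\n")
amr = io.StringIO("Strain_ID,GeneA\nS1,1\nS3,0\n")
merged = merge_tables({"MIC": read_table(mic), "AMR": read_table(amr)})
print(merged)
```

Only strains present in every table are kept, which makes mismatched Strain_ID values between files easy to spot.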

Need Help?

For more information about data preparation and formatting: