PyIntel Scoutz Scouting Data Structure
This document explains the data structure used in the PyIntel Scoutz application, detailing what information is collected and how it's used in match predictions and team analysis.
Data Sources
PyIntel Scoutz accepts data from multiple sources:
- CSV files (primary data source)
- JSON files
- QR code scanned data
- External API data (Statbotics and The Blue Alliance)
Core Data Fields
Team Identification
Field Name | Data Type | Description |
---|---|---|
teamNumber | string | Unique identifier for each team (e.g., "2056") |
Match Phase Data
Autonomous Phase
Field Name | Data Type | Description |
---|---|---|
auton_CoralScoringLevel1 | number | Count of Level 1 coral scored in autonomous |
auton_CoralScoringLevel2 | number | Count of Level 2 coral scored in autonomous |
auton_CoralScoringLevel3 | number | Count of Level 3 coral scored in autonomous |
auton_CoralScoringLevel4 | number | Count of Level 4 coral scored in autonomous |
auton_AlgaeScoringProcessor | number | Count of algae scored in processor during autonomous |
auton_AlgaeScoringBarge | number | Count of algae scored in barge during autonomous |
auton_LeftBarge | boolean | Whether the robot left the barge in autonomous |
auton_total | number | Calculated - Total autonomous phase score |
Teleoperated Phase
Field Name | Data Type | Description |
---|---|---|
teleop_CoralScoringLevel1 | number | Count of Level 1 coral scored in teleop |
teleop_CoralScoringLevel2 | number | Count of Level 2 coral scored in teleop |
teleop_CoralScoringLevel3 | number | Count of Level 3 coral scored in teleop |
teleop_CoralScoringLevel4 | number | Count of Level 4 coral scored in teleop |
teleop_AlgaeScoringProcessor | number | Count of algae scored in processor during teleop |
teleop_AlgaeScoringBarge | number | Count of algae scored in barge during teleop |
teleop_Defense | boolean | Whether the robot played defense |
teleop_total | number | Calculated - Total teleoperated phase score |
Endgame Phase
Field Name | Data Type | Description |
---|---|---|
endgame_Deep_Climb | boolean | Whether the robot performed a deep climb |
endgame_Shallow_Climb | boolean | Whether the robot performed a shallow climb |
endgame_Park | boolean | Whether the robot parked |
endgame_Comments | string | Qualitative comments from scouts |
endgame_total | number | Calculated - Total endgame phase score |
Derived Metrics
Field Name | Data Type | Description |
---|---|---|
defense_value | number | Calculated - Defense contribution value |
total_score | number | Calculated - Overall match score |
consistency | number | Calculated - 1 / (standard deviation + 1) |
climbing_percentage | number | Calculated - Percentage of matches with successful climbs |
Data Transformation Process
Raw data collected from scouting forms undergoes the following transformations:
- Data Loading: Scouting data is loaded from CSV, JSON, or QR code sources
- Type Conversion: Team numbers are converted to strings, boolean values are normalized
- Score Calculation: Phase scores are calculated based on the scoring formulas
- Aggregation: Data is aggregated by team with statistical operations (mean, std, max)
- Derived Metrics: Additional metrics like consistency ratings are calculated
- Team Profiles: Comprehensive profiles are generated for each team
Sample Data Format
Below is an example of raw scouting data in CSV format:
teamNumber,auton_CoralScoringLevel1,auton_CoralScoringLevel2,auton_LeftBarge,teleop_CoralScoringLevel2,teleop_Defense,endgame_Deep_Climb,endgame_Comments
2056,2,1,TRUE,5,TRUE,TRUE,"Great at defense, consistent climber"
1114,1,3,TRUE,3,FALSE,TRUE,"Excellent autonomous mode"
External API Data
The application enriches team data with external information from:
Statbotics API
- Team performance history
- Win/loss records
- Estimated Prediction Accuracy (EPA) metrics
The Blue Alliance API
- Team demographic information
- Competition history
- Alliance partnerships
Data Storage
Data is stored in multiple formats:
- Raw CSV/JSON files in the input directory
- Unified data in the application's data directory
- Trained machine learning models (serialized via pickle)
Data Security and Privacy
- All data is stored locally by default
- API keys are required for external data sources
- No personally identifiable information is collected
This comprehensive data structure allows the PyIntel Scoutz application to perform detailed analysis and generate accurate match predictions based on multiple performance metrics.