Inspiration
Pipeline integrity engineers constantly deal with inaccurate and inconsistent data from in-line inspection tools. Tools that slip or report in different measurement units can misidentify anomalies, which makes it extremely difficult to determine whether an anomaly is truly growing, new, or simply misaligned. These errors lead to unnecessary excavations, which waste time and money. "Inspector Pipe" was created to address this exact problem faced by RCP Engineering.
What it does
Inspector Pipe allows users to upload multiple CSV or Excel files from pipeline inspection runs. The system automatically aligns these datasets using fixed reference points such as girth welds, correcting for tool drift and measurement inconsistencies. Once aligned, Inspector Pipe matches anomalies across runs and calculates metrics such as corrosion growth rate, anomaly size changes, and potential future defects. It also estimates how urgent each anomaly is, that is, how many days it would take for the anomaly to grow into a threat, based on PHMSA integrity management regulations. Additionally, the platform displays anomalies on an interactive visual map of the pipeline with in-depth popups, allowing engineers to quickly see where critical defects are located with respect to relevant landmarks.
How we built it
To build this system, we used a FastAPI backend to handle the heavy data processing. We used Pandas and NumPy for the math, and SciPy for specific algorithms such as the Hungarian Algorithm for matching defects and piecewise linear interpolation for fixing odometer drift. The frontend is made with React running on Vite, with Tailwind CSS to handle the styling. To make the data readable and displayable, we plugged in Mapbox GL for the interactive map and Plotly.js for the charts. Finally, for the exported Excel report, we used XlsxWriter.
Calculations
Odometer Drift Correction: $$f(target_position) = interp1d(target_gw_positions, baseline_gw_positions, kind="linear", fill_value="extrapolate")$$
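As a rough illustration of the drift correction, the sketch below builds the interpolator from matched girth-weld positions and applies it to a few anomaly positions. The weld and anomaly values are made up for the example, and the variable `target_anomalies` is a hypothetical name; only the `interp1d` call mirrors the formula above.

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical odometer readings (ft) of the same girth welds in two runs.
baseline_gw_positions = np.array([0.0, 100.0, 200.0, 300.0])
target_gw_positions = np.array([0.0, 102.0, 205.0, 309.0])

# Piecewise-linear map from target-run odometer to baseline odometer.
f = interp1d(
    target_gw_positions,
    baseline_gw_positions,
    kind="linear",
    fill_value="extrapolate",
)

# Anomaly positions reported by the target run, mapped into the baseline frame.
target_anomalies = np.array([51.0, 153.5, 257.0])
corrected = f(target_anomalies)
print(corrected)
```

Each anomaly here sits halfway between two welds in the target run, so it lands halfway between the same two welds in the baseline frame.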
Composite Risk Score: $$composite = 0.4 * emergence_density + 0.3 * growth_rate_norm + 0.3 * critical_count_20yr_norm$$
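A small worked example of the weighted sum, assuming all three inputs have already been scaled to the range 0 to 1 (the source does not describe the normalization, so that part is an assumption, and `composite_risk` is a hypothetical helper name):

```python
def composite_risk(emergence_density, growth_rate_norm, critical_count_20yr_norm):
    """Weighted sum with the 0.4 / 0.3 / 0.3 weights from the formula above."""
    return (
        0.4 * emergence_density
        + 0.3 * growth_rate_norm
        + 0.3 * critical_count_20yr_norm
    )

# Illustrative inputs only, assumed to be pre-normalized to [0, 1].
score = composite_risk(0.5, 0.8, 0.2)
print(round(score, 3))
```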
Time to Critical: $$time_to_critical = (80.0 - current_depth_pct) / annual_growth_rate_pct$$
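The same formula as a short function. The guards for non-growing and already-critical anomalies are assumptions added for this sketch, not behavior described in the source. Because the growth rate is annual, the result is in years.

```python
import math

CRITICAL_DEPTH_PCT = 80.0  # critical wall-loss depth used in the formula above

def time_to_critical(current_depth_pct, annual_growth_rate_pct):
    """Years until the anomaly reaches the critical depth at a constant rate."""
    if annual_growth_rate_pct <= 0:
        # Assumption: an anomaly that is not growing never becomes critical.
        return math.inf
    remaining = CRITICAL_DEPTH_PCT - current_depth_pct
    # Assumption: an anomaly already past the critical depth reports zero.
    return max(0.0, remaining / annual_growth_rate_pct)

# Illustrative values: 45% deep, growing 2.5 percentage points per year.
years = time_to_critical(45.0, 2.5)
print(years)
```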
New Anomaly Emergence Density: $$density(pos) = exp(-0.5 * ((pos - center) / 500.0)^2)$$
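The density is a Gaussian kernel centered on a location of interest. The sketch below evaluates it for a single position; the parameter name `bandwidth` and the reading of 500.0 as feet are assumptions, and how contributions from several new anomalies are combined is not covered by the source.

```python
import numpy as np

def emergence_density(pos, center, bandwidth=500.0):
    """Gaussian kernel value for a position relative to a center."""
    return np.exp(-0.5 * ((pos - center) / bandwidth) ** 2)

# Illustrative positions along the pipeline.
at_center = emergence_density(1000.0, 1000.0)       # exactly on the center
one_bandwidth = emergence_density(1500.0, 1000.0)   # one bandwidth away
print(at_center, one_bandwidth)
```

The value is 1.0 at the center and falls to exp(-0.5), roughly 0.61, one bandwidth away.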
Hungarian Algorithm Matching:
After odometer correction, anomalies from different runs are matched using the Hungarian Algorithm (optimal bipartite assignment) with a weighted cost matrix.
Cost Matrix
For each pair of anomalies (one from run A, one from run B), the cost is:
| Component | Weight | Calculation | Normalization |
|---|---|---|---|
| Distance | 0.50 | abs(corrected_odo_A - corrected_odo_B) | clip(distance / max_distance_ft, 0, 1) |
| Clock Position | 0.30 | min(abs(a - b), 12.0 - abs(a - b)) | circular_distance / 6.0 |
| Feature Type | 0.20 | Category comparison | 0.0 = same, 0.3 = compatible, 1.0 = different |
- Clock Distance: Uses circular arithmetic: the distance between clock positions 11.5 and 0.5 is 1.0, not 11.0.
- Feature Categories:
  - Category 0: Metal loss, corrosion, cluster, metal loss manufacturing
  - Category 1: Dent, seam weld dent
  - Category 2: Other
Pairs with distance > max_distance_ft (default 50 ft) receive a penalty cost of 1,000,000 to prevent distant matches.
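The cost matrix and assignment step can be sketched as follows, using SciPy's `linear_sum_assignment`. The tuple layout `(odometer_ft, clock, category)`, the helper names, and the sample anomalies are all hypothetical. The sketch only distinguishes same versus different feature category, because the write-up does not say which category pairs count as "compatible" (cost 0.3).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

MAX_DISTANCE_FT = 50.0
PENALTY_COST = 1_000_000.0

def clock_distance(a, b):
    """Circular distance between two clock positions on a 12-hour dial."""
    diff = abs(a - b)
    return min(diff, 12.0 - diff)

def pair_cost(a, b):
    """Weighted cost for one pair; a and b are (odometer_ft, clock, category)."""
    distance = abs(a[0] - b[0])
    if distance > MAX_DISTANCE_FT:
        return PENALTY_COST  # too far apart to be the same anomaly
    distance_cost = min(max(distance / MAX_DISTANCE_FT, 0.0), 1.0)
    clock_cost = clock_distance(a[1], b[1]) / 6.0
    # Simplification: the 0.3 "compatible" tier is omitted in this sketch.
    feature_cost = 0.0 if a[2] == b[2] else 1.0
    return 0.50 * distance_cost + 0.30 * clock_cost + 0.20 * feature_cost

def match_runs(run_a, run_b):
    """Optimal one-to-one assignment, dropping penalty-cost pairs."""
    cost = np.array([[pair_cost(a, b) for b in run_b] for a in run_a])
    rows, cols = linear_sum_assignment(cost)
    return [
        (int(r), int(c), float(cost[r, c]))
        for r, c in zip(rows, cols)
        if cost[r, c] < PENALTY_COST
    ]

# Illustrative, already drift-corrected anomalies from two runs.
run_a = [(100.0, 11.5, 0), (250.0, 3.0, 1), (900.0, 6.0, 0)]
run_b = [(251.0, 3.0, 1), (101.0, 0.5, 0)]
matches = match_runs(run_a, run_b)
print(matches)
```

The third anomaly in `run_a` has no counterpart within 50 ft, so it stays unmatched rather than being forced onto a distant candidate.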
Windowed Segmentation
For pipelines with >1000 anomalies, a full N×M cost matrix is infeasible. The pipeline is divided into overlapping windows:
- Window size: 500 ft
- Step size: 400 ft (100 ft overlap)
- Already-matched indices are excluded from subsequent windows to prevent duplicate matches.
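A minimal sketch of the windowing idea, under simplifying assumptions: the cost inside each window is distance only (not the full weighted cost), windows start at position 0, and the function names and sample positions are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def window_bounds(length_ft, window_ft=500.0, step_ft=400.0):
    """Yield (start, end) spans that cover the pipeline with 100 ft overlap."""
    start = 0.0
    while start < length_ft:
        yield start, min(start + window_ft, length_ft)
        start += step_ft

def windowed_match(pos_a, pos_b, length_ft, max_distance_ft=50.0):
    """Match per window, skipping indices already matched in earlier windows."""
    matched_a, matched_b, pairs = set(), set(), []
    for lo, hi in window_bounds(length_ft):
        idx_a = [i for i, p in enumerate(pos_a) if lo <= p <= hi and i not in matched_a]
        idx_b = [j for j, p in enumerate(pos_b) if lo <= p <= hi and j not in matched_b]
        if not idx_a or not idx_b:
            continue
        # Distance-only cost matrix for the anomalies inside this window.
        cost = np.abs(np.subtract.outer(
            [pos_a[i] for i in idx_a], [pos_b[j] for j in idx_b]
        ))
        rows, cols = linear_sum_assignment(cost)
        for r, c in zip(rows, cols):
            if cost[r, c] <= max_distance_ft:
                i, j = idx_a[r], idx_b[c]
                pairs.append((i, j))
                matched_a.add(i)
                matched_b.add(j)
    return pairs

# Illustrative positions (ft); the anomaly near 450 ft falls in two windows.
pos_a = [100.0, 450.0, 850.0]
pos_b = [452.0, 101.0, 848.0]
pairs = windowed_match(pos_a, pos_b, length_ft=1200.0)
print(pairs)
```

The anomaly near 450 ft sits in the overlap between the first two windows, but it is matched only once because its index is excluded after the first window.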
Match Output
Each match produces:
- match_score: 1.0 is perfect, 0.0 is worst
- accepted: True if the match clears the acceptance threshold (default 0.8)
- match_detail: Component-level breakdown (distance confidence, clock confidence, feature confidence)
Three pairwise matching passes are run:
- 2007 ↔ 2015
- 2015 ↔ 2022
- 2007 ↔ 2022 (direct cross-check)
Other calculations not shown: Depth Growth, Dimension Growth, etc.
Challenges we ran into
One of our biggest challenges was building a visualization system on Mapbox that could clearly represent thousands of anomalies along a long pipeline in a way that was both accurate and easy to understand. For the overall system, we experimented with multiple UIs and settled on one that was compact and easily readable. Other challenges involved developing a reliable way to assign confidence to each matched anomaly and deciding which algorithms and resources to use for accurately forecasting future corrosion or anomaly growth.
Accomplishments that we're proud of
We are especially proud of the visualization dashboard, which allows Inspector Pipe to show where all anomalies are located with respect to the entire pipeline, turning raw engineering data into a real, usable product engineers can actually act on.
What we learned
We learned how to apply machine learning to detect anomalies and forecast their future growth using patterns in past data. Our team started with extremely limited full-stack experience, but we successfully built a complete system and gained hands-on experience integrating Mapbox.
What's next for InspectorPipe
Next, we plan to improve Inspector Pipe's UI, speed, and accuracy while refining our forecasting algorithms based on feedback from RCP Engineering. Additionally, we plan to incorporate environmental data into our analysis to give a more holistic view and improve the system's anomaly prediction. Our goal is to move the platform towards real-world usage in the field of engineering.