Can AI Predict Tokyo Bay Visibility? Testing the Limits of ML Extrapolation
2026-03-10
Real Tokyo Bay visibility: 2m summer, 6m winter. Our AI predicted 10m — wildly wrong. Why does AI fail at unseen locations? We tested machine learning's extrapolation problem with real data.
The Problem: Our Heatmap Showed Tokyo Bay as "10m Visibility"
Tokyo Bay Visibility: What the Real Data Shows
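A 2023 study published in PeerJ by Akada et al. [1] measured Secchi depth (the depth at which a white disk lowered into the water disappears from view, a standard measure of transparency) at 35 monitoring stations in Kanagawa Prefecture from 1963 to 2018. For Tokyo Bay specifically, the data show: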
| Season | Transparency (median) | Notes |
|---|---|---|
| Summer (June–Sept) | ~2m | Phytoplankton bloom at maximum |
| Winter (January) | ~6m | Winter wind-driven vertical mixing |
| Annual range | 2–6m | Even lower values recorded before 1993 |
A December 2024 survey by the Tokyo Metropolitan Island Agriculture, Forestry and Fisheries Research Center [2] at the Odaiba monitoring station recorded the Secchi disk reaching the bottom at approximately 4m depth, meaning transparency was at least 4m. The report described this as an exceptionally clear day for winter. In spring through autumn, 2m is considered "good" visibility for the site. In other words, Tokyo Bay's transparency typically falls between about 2m and 6m.
Our AI predicted ~10m. That is 4m above the winter median of ~6m and 5× the summer median of ~2m. So what went wrong?
What Is the "Extrapolation Problem" in Machine Learning?
Machine learning models perform well within the range of their training data (interpolation), but reliability drops sharply when predicting outside that range (extrapolation). This is a fundamental limitation of all statistical models, not just neural networks or gradient boosting.
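The gap between interpolation and extrapolation can be seen in a toy example. The sketch below is not our production model: it fits a straight line to made-up data generated from a curved relationship (`truth` is a hypothetical stand-in), then compares the error at a point inside the training range with the error at a point far outside it.

```python
# Synthetic illustration only: the data and the quadratic "truth" are made up.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def truth(x):
    """Hypothetical true relationship the model never sees directly."""
    return x ** 2


# Training inputs cover only the range 0.0 to 3.0.
train_x = [i * 0.25 for i in range(13)]
train_y = [truth(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)


def predict(x):
    return slope * x + intercept


interp_error = abs(predict(1.5) - truth(1.5))    # inside the training range
extrap_error = abs(predict(10.0) - truth(10.0))  # far outside it

print(f"error inside training range (x=1.5): {interp_error:.2f}")
print(f"error outside training range (x=10): {extrap_error:.2f}")
```

The line is a tolerable approximation between 0 and 3, but nothing in the training data constrains its behavior at x=10, so the error there is many times larger.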
Our AI's training data
- 44 dive sites across Japan (44,440 real visibility observations)
- All collected from dive shop blogs at sites where recreational diving occurs
- Dive sites in the Tokyo Bay interior: zero
Dive shops naturally locate near diveable water. Tokyo Bay's interior is not a recreational diving destination, so dive-log visibility data for it simply does not exist. The model had no opportunity to learn what makes Tokyo Bay tick.
How Badly Does Extrapolation Fail?
| Location | AI Prediction | Actual Visibility | Error |
|---|---|---|---|
| Tokyo Bay interior (annual) | ~10m | 2–6m | +4 to +8m (about 1.7–5× overestimate) |
| Tokyo Bay (summer) | ~10m | ~2m | +8m (5× overestimate) |
| Izu Oceanic Park (trained) | R²=0.82 | Matches observations well | Small (normal) |
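The Tokyo Bay error figures follow from simple arithmetic on the values quoted in this article. A minimal sketch, using the ~10m prediction and the ~2m and ~6m seasonal medians (variable names are ours, for illustration only):

```python
predicted_m = 10.0
# Seasonal median transparency for Tokyo Bay as quoted in this article (metres).
observed_m = {"summer": 2.0, "winter": 6.0}

errors = {}
for season, actual in observed_m.items():
    absolute = predicted_m - actual   # overestimate in metres
    ratio = predicted_m / actual      # overestimate as a multiple
    errors[season] = (absolute, ratio)
    print(f"{season}: +{absolute:.0f}m, {ratio:.1f}x overestimate")
```

Summer is the worst case at +8m, a 5× overestimate. Winter comes out at +4m, which is about 1.7× the observed median.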
Why Are Enclosed Bays Especially Hard?
Enclosed bays like Tokyo Bay operate under fundamentally different visibility mechanisms than the open-ocean or coastal sites in our training data:
- Eutrophication: River inflows bring massive nutrient loads, triggering phytoplankton blooms
- Resuspension: Shallow depths allow winds and waves to stir up bottom sediment
- Anoxic bottom water: Summer hypoxia creates extreme water quality degradation
- Ship traffic: Dense port operations continually disturb sediment
None of these factors appear in our feature set, and no amount of coastal site data extrapolates correctly into these dynamics.
The Fix: Data Turns Extrapolation into Interpolation
The solution is straightforward in principle: with visibility data from inside Tokyo Bay, the model could learn its dynamics. Environmental monitoring agencies and research institutions conduct regular dives in Tokyo Bay. If those records were incorporated, the model could learn the bay's eutrophication-driven transparency patterns, and predictions there would become interpolation rather than extrapolation.
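To illustrate how new data changes the picture, here is a toy sketch using a 1-nearest-neighbour predictor on a hypothetical one-dimensional coastline. This is not how our model works internally, and every position and visibility value below is synthetic. It only shows that a query inside an unsampled bay inherits the value of a distant open-coast site until bay samples are added.

```python
def nearest_neighbour_predict(samples, query_km):
    """Return the visibility of the training sample closest to query_km."""
    _, visibility = min(samples, key=lambda s: abs(s[0] - query_km))
    return visibility


# (position along a hypothetical coastline in km, visibility in m); synthetic values
open_coast = [(0, 12.0), (20, 11.0), (40, 10.0)]
inner_bay = [(90, 2.5), (100, 2.0)]

query_km = 98  # a point inside the hypothetical bay

before = nearest_neighbour_predict(open_coast, query_km)
after = nearest_neighbour_predict(open_coast + inner_bay, query_km)

print(f"without bay data: {before}m")
print(f"with bay data:    {after}m")
```

Without bay samples the predictor can only echo the nearest open-coast site. Once bay samples exist, the same query is answered from data collected under the same conditions.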
At sites where we do have training data — even geographically isolated ones like Yonaguni (Japan's westernmost island) or Sado Island in the Sea of Japan — our model achieves R²=0.7–0.8. Data availability is everything.
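For readers unfamiliar with R², the sketch below computes it as the standard coefficient of determination, 1 − SS_res / SS_tot. We are assuming that standard definition here, and the observation and prediction lists are made-up numbers chosen only to show the calculation.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic visibility values in metres, for illustration only.
observed = [8.0, 12.0, 15.0, 10.0, 5.0]
predicted = [9.0, 11.0, 13.0, 10.0, 7.0]
good_fit = r_squared(observed, predicted)

# A model that answers 10m regardless of conditions, scored against low observations.
bay_observed = [2.0, 3.0, 4.0, 6.0]
bay_predicted = [10.0, 10.0, 10.0, 10.0]
poor_fit = r_squared(bay_observed, bay_predicted)

print(f"good fit: R^2 = {good_fit:.2f}")
print(f"poor fit: R^2 = {poor_fit:.2f}")
```

An R² near 1 means predictions track the observations closely. A negative R² means the model does worse than simply predicting the mean of the observations.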
Key Takeaways
- Tokyo Bay real visibility: 2m in summer, up to 6m in winter (Akada et al., 2023)
- Our AI predicted ~10m — a 5× overestimate in summer
- Root cause: extrapolating outside training data is inherently unreliable
- Solution: collect visibility data from inside Tokyo Bay
On this site, we provide visibility forecasts only for the 44 sites with actual training data, where the model reaches an R² of up to 0.82. We do not extrapolate beyond our training region, so every forecast is backed by observations from that site.
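One way to enforce a "trained sites only" policy is a coverage check before forecasting. The sketch below is an assumption about how such a guard could look, not a description of our actual implementation: `TRAINED_SITES`, `MAX_DISTANCE_KM`, and `can_forecast` are hypothetical names, the coordinates are approximate, and the 5km threshold is arbitrary.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Hypothetical registry of trained sites; coordinates are approximate.
TRAINED_SITES = {"Izu Oceanic Park": (34.89, 139.14)}
MAX_DISTANCE_KM = 5.0  # hypothetical threshold, not a tuned value


def can_forecast(lat, lon):
    """True only if the query lies close to at least one trained site."""
    nearest = min(
        haversine_km(lat, lon, site_lat, site_lon)
        for site_lat, site_lon in TRAINED_SITES.values()
    )
    return nearest <= MAX_DISTANCE_KM


print(can_forecast(34.89, 139.14))  # at a trained site
print(can_forecast(35.55, 139.90))  # roughly inside Tokyo Bay
```

Distance alone is a crude proxy for "inside the training distribution", since two nearby sites can have very different water conditions, but it is enough to refuse a query such as the Tokyo Bay interior.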
References
- [1] Akada M, Kodama M, Yamaguchi H. (2023). "Eutrophication trends in the coastal region of the Great Tokyo area based on long-term trends of Secchi depth." PeerJ 11:e15764. https://peerj.com/articles/15764/
- [2] Tokyo Metropolitan Island Agriculture, Forestry and Fisheries Research Center (December 2024). Tokyo Bay Inner Bay Survey Report. Tokyo Metropolitan Government.
- [3] Nishijima W et al. (2019). "Distribution of region-specific background Secchi depth in Tokyo Bay and Ise Bay, Japan." Ecological Indicators 98, 133–141.