Geometry Parsing Reference¶
Detailed reference for HEC-RAS geometry file parsing.
FORTRAN-Era Format Conventions¶
HEC-RAS geometry files inherit formatting from FORTRAN punch-card era conventions. Understanding this legacy is essential for correct parsing.
Fixed-Width Column Structure¶
HEC-RAS uses position-based parsing, not whitespace splitting. Each value occupies a fixed column width:
8-Character Columns (1D Data):
Columns:  0-7     8-15    16-23   24-31   32-39   40-47   48-55   56-63   64-71   72-79
Values:   sta1    elev1   sta2    elev2   sta3    elev3   sta4    elev4   sta5    elev5
Example: "       0  660.41       5  660.61      40  659.85      45  659.61      50  659.51"
          ^^^^^^^ ^^^^^^^ ^^^^^^^ ^^^^^^^ ^^^^^^^ ^^^^^^^ ^^^^^^^ ^^^^^^^ ^^^^^^^ ^^^^^^^
          8 chars 8 chars 8 chars 8 chars 8 chars 8 chars 8 chars 8 chars 8 chars 8 chars
16-Character Columns (2D Coordinates):
Columns:  0-15            16-31           32-47           48-63
Values:   X1              Y1              X2              Y2
Example: "    648224.43125   4551425.84375    648230.12500   4551430.87500"
Why 80 Characters?¶
The 80-character line limit comes from 80-column IBM punch cards, the standard input medium of the FORTRAN era (1960s-1980s). HEC-RAS maintains this convention:
- 10 values × 8 chars = 80 characters per line
- Last line may have fewer values
- Whitespace is structural data, not formatting
Alignment Rules¶
| Rule | Description | Example |
|---|---|---|
| Right-aligned | Values align to right edge of column | "    27.2" not "27.2    " |
| Left-padded | Spaces fill unused left portion | "       0" for zero |
| No separators | Adjacent values touch | "  660.41       5" |
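The alignment rules above can be sketched in the writing direction as well. This is a minimal illustration, not the formatter HEC-RAS or ras-commander uses: the names `format_value` and `format_fixed_width`, the `decimals` parameter, and the trimming of trailing zeros are all assumptions made for the example.

```python
def format_value(value, width=8, decimals=2):
    """Right-align one number in a fixed-width column (hypothetical helper)."""
    text = f"{value:.{decimals}f}"
    if "." in text:
        # Trim trailing zeros and a dangling decimal point: 40.00 -> 40
        text = text.rstrip("0").rstrip(".")
    if len(text) > width:
        raise ValueError(f"{text!r} does not fit in {width} characters")
    return text.rjust(width)  # Left-pad with spaces


def format_fixed_width(values, width=8, per_line=10):
    """Pack values into lines of per_line columns with no separators."""
    fields = [format_value(v, width) for v in values]
    return [
        "".join(fields[i:i + per_line])
        for i in range(0, len(fields), per_line)
    ]


lines = format_fixed_width([0, 660.41, 5, 660.61, 40, 659.85])
```

With the default settings a full line holds 10 columns of 8 characters, and the final line is shorter when the value count is not a multiple of 10.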
Plain Text vs HDF Storage¶
| Data Type | Plain Text (.g##) | HDF (.g##.hdf) | Primary Source |
|---|---|---|---|
| Cross sections | ✓ Sta/Elev, Mann n | ✓ Computed HTAB | Plain text |
| 2D flow areas | ✓ Perimeter, params | ✓ Full mesh | Plain text input |
| Storage areas | ✓ Definitions | ✓ Full data | Both |
| Connections | ✓ Weir profile, gates | ✓ Indexed | Plain text |
| Pipe networks | ✗ NOT PRESENT | ✓ EXCLUSIVE | HDF only |
Never Edit HDF Directly
HDF files are regenerated by HEC-RAS when you run the geometry preprocessor. Always edit plain text files and let HEC-RAS recompute the HDF.
Cross Section Format¶
Header Line¶
| Field | Description |
|---|---|
| type | Cross section type (1=standard) |
| rs | River station |
| lob_len | Left overbank reach length |
| ch_len | Channel reach length |
| rob_len | Right overbank reach length |
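As a rough sketch of reading these fields, the function below splits a header of the form `Type RM Length L Ch R = {type} ,{rs},{lob_len},{ch_len},{rob_len}` (the same layout shown for structures later in this reference). The function name `parse_xs_header` and the sample line are hypothetical; the river station is kept as text on the assumption that it should not be reformatted.

```python
HEADER_PREFIX = "Type RM Length L Ch R ="


def parse_xs_header(line):
    """Split a 'Type RM Length L Ch R =' header into named fields."""
    if not line.startswith(HEADER_PREFIX):
        raise ValueError("not a Type RM Length header line")
    parts = [p.strip() for p in line[len(HEADER_PREFIX):].split(",")]
    if len(parts) < 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")

    def to_float(text):
        # Reach lengths may be blank; report those as None
        return float(text) if text else None

    return {
        "type": int(parts[0]),
        "rs": parts[1],  # kept as text (assumption)
        "lob_len": to_float(parts[2]),
        "ch_len": to_float(parts[3]),
        "rob_len": to_float(parts[4]),
    }


# Illustrative line, not taken from a real model
header = parse_xs_header("Type RM Length L Ch R = 1 ,5.99 ,100,120,110")
```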
Station-Elevation¶
The #Sta/Elev= {count} line is followed by pairs of station and elevation values in 8-character fixed-width format, up to 10 values (5 pairs) per line.
Manning's n¶
Four Format Variations:
| Format | Count | Flag1 | Flag2 | Description |
|---|---|---|---|---|
| Standard L-MC-R | 3 | 0 | 0 | Left-MainChannel-Right (legacy) |
| Standard Modern | 3 | -1 | 0 | L-MC-R with modern spacing |
| Variable Segments | N | -1 | 0 | N custom roughness zones |
| Vertical Variation | 0 | -1 | 0 | Depth-dependent roughness |
Flag Interpretation:
- count: Number of Manning's n segments (triplets of values)
- flag1: 0 = old format (space after comma), -1 = modern format
- flag2: Always 0 in observed data (reserved)
Data Line Format (triplets):
Each segment is a triplet: (station, n_value, flag) where flag is always 0.
Standard L-MC-R Format Example:
- Segment 1: station=0, n=0.06 (left overbank)
- Segment 2: station=190, n=0.04 (main channel)
- Segment 3: station=375, n=0.10 (right overbank)
Bank Station Alignment
In standard 3-segment format, segment boundary stations [190, 375] EXACTLY match bank stations.
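The triplet layout and the bank-station alignment can be sketched as follows. The helper names `parse_mann_header` and `group_segments` are hypothetical, the header string is illustrative, and the values are the three segments listed above, assumed to have already been read from the fixed-width data line.

```python
def parse_mann_header(line):
    """Return (count, flag1, flag2) from a '#Mann=' header line."""
    fields = line.split("=", 1)[1].split(",")
    count, flag1, flag2 = (int(f.strip()) for f in fields[:3])
    return count, flag1, flag2


def group_segments(values, count):
    """Group a flat value list into (station, n_value) per segment."""
    if len(values) != count * 3:
        raise ValueError(f"expected {count * 3} values, got {len(values)}")
    # Each triplet is (station, n_value, flag); the flag is dropped
    return [(values[i], values[i + 1]) for i in range(0, len(values), 3)]


count, flag1, flag2 = parse_mann_header("#Mann= 3 , 0 , 0")
segments = group_segments([0, 0.06, 0, 190, 0.04, 0, 375, 0.10, 0], count)

# In the standard 3-segment form, segments 2 and 3 start at the bank stations
bank_left, bank_right = 190, 375
boundaries = [station for station, _ in segments[1:]]
banks_aligned = boundaries == [bank_left, bank_right]
```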
Variable Segment Format Example:
#Mann= 5 ,-1,0
4345.32 .07 0 4948.02 .03 0 5054.3 .07 0
5072.6 .12 0 5234.3 .07 0
Bank Sta=4939,5101.7
- 5 segments allow more granular roughness zones
- Segment boundaries do NOT necessarily match bank stations
- First segment station may be negative (observed: -491.95, -353.01, etc.)
Vertical Variation Format:
#Mann= 0 ,-1,0
Bank Sta=2534.05,2632.02
Vertical n Elevations= 2
32 38
Vertical n for Station=0
.12 .24
- Count=0 means no horizontal segments
- Manning's n varies by ELEVATION, not station
- Used for depth-dependent roughness (advanced feature)
Bank Stations¶
Storage Area Format¶
Connection Format¶
SA/2D Connection¶
Connection={name}
Connection HT={htab_params}
Conn Weir WD={width},{weir_coef}
Conn Weir Embankment={emb_ss},{emb_bottom_w}
Conn Weir Sta Elev= {count}
{sta_1} {elev_1}
{sta_2} {elev_2}
Gate Data¶
Conn Gate Name={gate_name}
Conn Gate Groups={gate_type},{num_gates},{width},{height}
Conn Gate Invert={invert_elev}
Inline Structure Format¶
Inline Weir¶
Type RM Length L Ch R = 3 ,{rs},{lob_len},{ch_len},{rob_len}
Inline Structure Sta Elev= {count}
{sta_1} {elev_1}
Bridge¶
Type RM Length L Ch R = 2 ,{rs},{lob_len},{ch_len},{rob_len}
BEGIN DECK/ROADWAY DATA
Deck Sta Lo Hi={sta_lo},{sta_hi}
Deck Elev={elev_lo},{elev_hi}
END DECK/ROADWAY DATA
Culvert¶
Type RM Length L Ch R = 2 ,{rs},{lob_len},{ch_len},{rob_len}
Bridge Culvert-{flags}
Culvert={shape},{span},{rise},{length},{mannings_n},{entrance_loss},{exit_loss},{inlet_type},{outlet_type},{upstream_invert},{upstream_station},{downstream_invert},{downstream_station},{name},{culvert_code},{chart_number}
Culvert Bottom n={bottom_n}
Culvert Bottom Depth={bottom_depth}
Multi-barrel culverts use a related header with station pairs on fixed-width continuation lines:
Multiple Barrel Culv={shape},{span},{rise},{length},{mannings_n},{entrance_loss},{exit_loss},{inlet_type},{outlet_type},{upstream_invert},{downstream_invert},{num_barrels},{name},{culvert_code},{chart_number}
{upstream_station_1}{downstream_station_1}{upstream_station_2}{downstream_station_2}...
Culvert Bottom n={bottom_n}
Shape Codes:
| Code | Shape |
|---|---|
| 1 | Circular |
| 2 | Box |
| 3 | Pipe Arch |
| 4 | Ellipse |
| 5 | Arch |
| 6 | Semi-Circle |
| 7 | Low Profile Arch |
| 8 | High Profile Arch |
| 9 | Con Span |
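For parsers that need readable shape names, the table above translates directly into a lookup. The names `CULVERT_SHAPES` and `shape_name` are hypothetical and are not part of the GeomCulvert API.

```python
# Shape codes copied from the table above
CULVERT_SHAPES = {
    1: "Circular",
    2: "Box",
    3: "Pipe Arch",
    4: "Ellipse",
    5: "Arch",
    6: "Semi-Circle",
    7: "Low Profile Arch",
    8: "High Profile Arch",
    9: "Con Span",
}


def shape_name(code):
    """Translate a culvert shape code (int or numeric string) to its name."""
    try:
        return CULVERT_SHAPES[int(code)]
    except (KeyError, ValueError):
        raise ValueError(f"unknown culvert shape code: {code!r}") from None
```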
For validation-grade chart/scale combinations, barrel/group limits, GUI field labels, and HDF storage mapping, see Culvert Taxonomy.
GeomCulvert.get_culverts() returns both record types with a common schema. Single-barrel Culvert= records populate UpstreamStation, DownstreamStation, and BarrelStations=[(upstream, downstream)]. Multiple Barrel Culv= records populate NumBarrels, BarrelStations, UpstreamStations, and DownstreamStations; UpstreamStation and DownstreamStation are only set when there is exactly one barrel pair.
Use GeomCulvert.set_culverts() to replace the culvert records at an existing bridge/culvert structure. Use GeomCulvert.set_culvert() to update one record by culvert_index or culvert_name, or append a new record when no selector is supplied. Both methods validate shape codes/names, required fields, and multi-barrel station-pair counts before modifying the file. A .bak backup is created before writing.
Adjacent ineffective-flow coordination is handled through GeomCulvert.get_adjacent_cross_sections() and GeomCulvert.set_adjacent_ineffective_flow(), which delegate the actual cross-section writes to GeomCrossSection.set_ineffective_flow().
Parsing Rules¶
Fixed-Width Fields¶
HEC-RAS uses FORTRAN-style fixed-width formatting (legacy from 80-column punch cards):
- 8 characters per field (most common for 1D data)
- 16 characters for 2D coordinates
- Right-justified with left space padding
- 10 values per line = 80 characters
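# Parse 8-character fixed-width
import re

# Matches 12, 12.5, 12. and .06, with an optional leading minus sign
NUMBER_RE = re.compile(r'-?(?:\d+\.?\d*|\.\d+)')

def parse_fixed_width(line, width=8):
    """Slice a line into fixed-width columns and convert each to float."""
    text = line.rstrip()
    values = []
    for i in range(0, len(text), width):
        field = text[i:i+width].strip()
        if not field:
            continue  # Blank column
        try:
            values.append(float(field))
        except ValueError:
            # Handle merged values with regex fallback
            values.extend(float(p) for p in NUMBER_RE.findall(field))
    return values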
Critical: Use Column Position, Not Whitespace
NEVER use .split() on fixed-width sections. Whitespace is structural data, not formatting.
Count Interpretation Rules¶
Critical Section
Count interpretation varies by keyword context. Misinterpreting counts is the most common parsing bug.
Many keywords carry a count (some are prefixed with #), but the meaning of the count differs:
| Keyword | Count Meaning | Total Values | Formula |
|---|---|---|---|
| #Sta/Elev= | Number of PAIRS | count × 2 | 40 → 80 values |
| #Mann= | Number of SEGMENTS | count × 3 | 3 → 9 values |
| Reach XY= | Number of PAIRS | count × 2 | 591 → 1182 values |
| Storage Area Surface Line= | Number of POINTS | count × 2 | 117 → 234 values |
| Storage Area Elev Volume= | Number of PAIRS | count × 2 | 53 → 106 values |
| Connection Line= | Number of POINTS | count × 2 | 18 → 36 values |
| Levee= | Explicit count | count total | 12 , 0 → 12 values |
Examples:
# Station/Elevation: 40 PAIRS = 80 total values
count = int(line.split('=')[1].strip())
total_values = count * 2
# Manning's n: 3 SEGMENTS = 9 total values (3 triplets)
parts = line.split('=')[1].split(',')
count = int(parts[0].strip())
total_values = count * 3 # station, n_value, flag for each segment
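The count rules can also be collected into a single lookup so that each keyword's multiplier lives in one place. This is a sketch under the assumption that the count is always the first comma-separated field after the equals sign; `VALUES_PER_ITEM` and `expected_value_count` are hypothetical names.

```python
# Values per counted item, taken from the table above
VALUES_PER_ITEM = {
    "#Sta/Elev": 2,
    "#Mann": 3,
    "Reach XY": 2,
    "Storage Area Surface Line": 2,
    "Storage Area Elev Volume": 2,
    "Connection Line": 2,
    "Levee": 1,
}


def expected_value_count(line):
    """Return how many numeric values should follow a count keyword line."""
    keyword, _, rest = line.partition("=")
    keyword = keyword.strip()
    if keyword not in VALUES_PER_ITEM:
        raise KeyError(f"no count rule for keyword {keyword!r}")
    # Assumption: the count is the first comma-separated field
    count = int(rest.split(",")[0].strip())
    return count * VALUES_PER_ITEM[keyword]
```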
Continuation¶
Data continues until the next keyword or the end of the section. The last line may hold fewer values than a full line.
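One way to handle continuation is to let the declared count drive the read instead of looking for the next keyword. The sketch below assumes the expected total is already known and that every data line is cleanly column-aligned; `read_values` is a hypothetical helper.

```python
def read_values(lines, start, total_values, width=8):
    """Read fixed-width values from lines[start:] until total_values are found.

    Returns (values, next_index), where next_index is the first unread line.
    """
    values = []
    i = start
    while len(values) < total_values and i < len(lines):
        row = lines[i].rstrip("\r\n")
        for col in range(0, len(row), width):
            field = row[col:col + width].strip()
            if field:
                values.append(float(field))
        i += 1
    if len(values) != total_values:
        raise ValueError(f"expected {total_values} values, got {len(values)}")
    return values, i


# Build two data rows: a full line of 10 values, then a short final line
full_row = "".join(str(v).rjust(8) for v in
                   [0, 660.41, 5, 660.61, 40, 659.85, 45, 659.61, 50, 659.51])
short_row = "".join(str(v).rjust(8) for v in [55, 659.4])
block = ["#Sta/Elev= 6", full_row, short_row, "Bank Sta=5,50"]
values, next_index = read_values(block, 1, 12)
```

Because the loop stops as soon as the count is satisfied, the keyword line that follows the data is never passed to float().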
Point Limits¶
| Element | Limit |
|---|---|
| Cross section points | 450 |
| Weir profile points | 500 |
| Rating curve points | 100 |
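A writer can check these limits before emitting data. The sketch below only encodes the table above; `POINT_LIMITS` and `check_point_limit` are hypothetical names.

```python
# Limits copied from the table above
POINT_LIMITS = {
    "cross_section": 450,
    "weir_profile": 500,
    "rating_curve": 100,
}


def check_point_limit(element, n_points):
    """Raise if n_points exceeds the limit; otherwise return the headroom."""
    limit = POINT_LIMITS[element]
    if n_points > limit:
        raise ValueError(
            f"{element} has {n_points} points, limit is {limit}"
        )
    return limit - n_points
```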
Bank Station Interpolation¶
When setting station-elevation data, bank stations may need interpolation:
import pandas as pd

def interpolate_bank(sta_elev_df, bank_station):
    """Interpolate elevation at bank station if not exact match."""
    if bank_station in sta_elev_df['station'].values:
        return sta_elev_df  # Bank already on point

    # Find bracketing stations (assumes rows are sorted by station and the
    # bank station lies strictly inside the station range)
    lower = sta_elev_df[sta_elev_df['station'] < bank_station].iloc[-1]
    upper = sta_elev_df[sta_elev_df['station'] > bank_station].iloc[0]

    # Linear interpolation
    ratio = (bank_station - lower['station']) / (upper['station'] - lower['station'])
    elev = lower['elevation'] + ratio * (upper['elevation'] - lower['elevation'])

    # Insert new point
    new_row = pd.DataFrame({'station': [bank_station], 'elevation': [elev]})
    result = pd.concat([sta_elev_df, new_row]).sort_values('station').reset_index(drop=True)
    return result
Coordinate Systems¶
Geometry files may include projection information:
GIS Projection Zone=0
GIS Projection=PROJCS["NAD_1983_StatePlane_Texas_Central_FIPS_4203_Feet"...
Edge Cases and Pitfalls¶
Merged Values in Fixed-Width¶
Problem: Numbers may run together without whitespace separators.
Solution: Use regex fallback:
import re

try:
    values.append(float(value_str))
except ValueError:
    # The alternation also matches values without a leading zero, such as .06
    parts = re.findall(r'-?(?:\d+\.?\d*|\.\d+)', value_str)
    values.extend(float(p) for p in parts)
2D Coordinates Exceeding Column Width¶
Problem: 2D coordinates often exceed 16 characters.
Solution: Use flexible parsing for coordinate sections, not strict fixed-width.
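One possible reading of "flexible parsing" is sketched below: trust the 16-character columns unless a whitespace-delimited token is wider than a column and still parses as a single number, which suggests an overflowed value. This heuristic is an assumption for illustration, not ras-commander's implementation, and `parse_coordinates` is a hypothetical name. It cannot recover an overflowed value that also touches its neighbor.

```python
def _is_float(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def parse_coordinates(line, width=16):
    """Parse a coordinate line, tolerating values wider than the column."""
    text = line.rstrip("\r\n")
    tokens = text.split()
    fields = [text[i:i + width].strip() for i in range(0, len(text), width)]
    fields = [f for f in fields if f]

    # A token wider than a column that is still one valid number
    # is treated as an overflowed value
    overflow = any(len(t) > width and _is_float(t) for t in tokens)

    if not overflow and all(_is_float(f) for f in fields):
        return [float(f) for f in fields]  # Strict fixed-width
    return [float(t) for t in tokens]      # Whitespace fallback


aligned = parse_coordinates("    648224.43125   4551425.84375")
touching = parse_coordinates("648224.4312500004551425.84375000")
too_wide = parse_coordinates("12345678.12345678 4551425.84375")
```

The whitespace fallback is used only when the column layout has demonstrably failed, so ordinary lines with touching values still go through position-based slicing.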
Empty or Zero Counts¶
Problem: Section exists but has zero items.
Solution: Check the declared count before parsing data, and skip the data read when it is zero.
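A minimal sketch of the zero-count guard, using the `#Mann= 0 ,-1,0` header from the vertical variation example. The function `parse_block` and its parameters are hypothetical; the point is only that the reader returns before touching the next line, which is a keyword and not data.

```python
def parse_block(lines, header_index, values_per_item=2, width=8):
    """Read the values that follow a count keyword, honoring a zero count."""
    header = lines[header_index]
    count = int(header.split("=", 1)[1].split(",")[0].strip())
    if count == 0:
        return []  # Nothing to read; the next line is another keyword

    expected = count * values_per_item
    values = []
    for line in lines[header_index + 1:]:
        if len(values) >= expected:
            break
        row = line.rstrip("\r\n")
        for col in range(0, len(row), width):
            field = row[col:col + width].strip()
            if field:
                values.append(float(field))
    return values


empty = parse_block(["#Mann= 0 ,-1,0", "Bank Sta=2534.05,2632.02"], 0, 3)
```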
Decimal Point Variations¶
Problem: Values may be .06 or 0.06.
Solution: Python's float() handles both formats correctly.
Version-Specific Keywords¶
Some keywords differ between HEC-RAS versions. Check version in geometry file header when implementing parsers.
Validation Strategies¶
Cross-Validate with HDF¶
Always compare parsed text with HDF data when available:
def validate_cross_section(txt_pairs, hdf_path, xs_index):
    """Validate text parsing against HDF."""
    import h5py
    import numpy as np

    with h5py.File(hdf_path, 'r') as f:
        info = f['Geometry/Cross Sections/Station Elevation Info'][xs_index]
        start, count = info[0], info[1]
        hdf_pairs = f['Geometry/Cross Sections/Station Elevation Values'][start:start+count]

    return np.allclose(txt_pairs, hdf_pairs, rtol=1e-5)
Count Validation¶
Always verify parsed count matches declared count:
declared_count = 40 # From '#Sta/Elev= 40'
total_values = declared_count * 2 # 80 values expected
values = parse_values(lines, start, end)
assert len(values) == total_values, f"Expected {total_values}, got {len(values)}"
Range Checks¶
Verify physical reasonableness:
# Manning's n typically 0.01-0.20
assert all(0 < n < 1.0 for n in mannings_n), "Invalid Manning's n"
# Stations should be monotonic (usually)
assert all(stations[i] <= stations[i+1] for i in range(len(stations)-1))
Implementation Patterns¶
Patterns for developers extending ras-commander or writing custom parsers.
State Machine Pattern for Section Parsing¶
HEC-RAS geometry files have hierarchical structure. Use a state machine to track context:
class GeometryParser:
    """State machine for parsing geometry sections."""

    def __init__(self):
        self.state = 'INITIAL'
        self.current_river = None
        self.current_reach = None
        self.current_xs = None

    def parse_line(self, line: str):
        """Process line based on current state."""
        if line.startswith('River Reach='):
            self.state = 'IN_REACH'
            parts = line.split('=')[1].split(',')
            self.current_river = parts[0].strip()
            self.current_reach = parts[1].strip()
        elif line.startswith('Type RM Length'):
            self.state = 'IN_CROSS_SECTION'
            # Parse cross section header
        elif line.startswith('#Sta/Elev='):
            self.state = 'READING_STA_ELEV'
            self.expected_pairs = int(line.split('=')[1])
        elif self.state == 'READING_STA_ELEV':
            # Parse fixed-width values until count reached
            pass
Key States:
- INITIAL - Before any section
- IN_REACH - Inside River Reach block
- IN_CROSS_SECTION - Inside cross section
- READING_* - Consuming multi-line data
Backup-Modify-Write Pattern¶
Always create backups before modifying geometry files:
from pathlib import Path
import shutil
from datetime import datetime

def safe_modify_geometry(geom_path: Path, modify_func):
    """Safely modify geometry with automatic backup."""
    # 1. Create timestamped backup. Append to the full file name so the
    #    .g## extension is kept and backups of different files cannot collide.
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    backup_path = geom_path.with_name(f'{geom_path.name}.{timestamp}.bak')
    shutil.copy2(geom_path, backup_path)

    try:
        # 2. Read current content
        content = geom_path.read_text()

        # 3. Apply modifications
        modified = modify_func(content)

        # 4. Write atomically (temp file + rename)
        temp_path = geom_path.with_name(f'{geom_path.name}.tmp')
        temp_path.write_text(modified)
        temp_path.replace(geom_path)

        # 5. Clear geometry preprocessor
        from ras_commander import RasGeo
        RasGeo.clear_geompre_files()

        return True
    except Exception as e:
        # Restore from backup on failure
        shutil.copy2(backup_path, geom_path)
        raise RuntimeError(f"Modification failed, restored backup: {e}") from e
Always Clear Geompre
After ANY geometry modification, call RasGeo.clear_geompre_files() to force HEC-RAS to regenerate hydraulic tables.
HDF Reading Pattern¶
Use context managers and handle missing datasets gracefully:
import logging
import h5py
import numpy as np
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)

def safe_read_dataset(
    hdf_path: Path,
    dataset_path: str,
    default: Optional[np.ndarray] = None
) -> Optional[np.ndarray]:
    """Read HDF dataset with graceful fallback."""
    try:
        with h5py.File(hdf_path, 'r') as hdf:
            if dataset_path not in hdf:
                return default
            data = hdf[dataset_path][:]
            # Handle byte strings from HDF
            if data.dtype.kind == 'S':  # Byte string
                data = np.char.decode(data, 'utf-8')
            return data
    except (OSError, KeyError) as e:
        logger.warning(f"Could not read {dataset_path}: {e}")
        return default

def iter_hdf_groups(hdf_path: Path, base_path: str):
    """Iterate over groups in HDF file."""
    # The file stays open only while the generator is being consumed
    with h5py.File(hdf_path, 'r') as hdf:
        if base_path not in hdf:
            return
        base = hdf[base_path]
        for name in base.keys():
            if isinstance(base[name], h5py.Group):
                yield name, base[name]
Section Extraction Pattern¶
Extract specific sections from geometry files:
import re
from typing import List, Tuple

def extract_sections(
    content: str,
    start_pattern: str,
    end_patterns: List[str]
) -> List[Tuple[int, int, str]]:
    """
    Extract sections matching start pattern.
    Returns list of (start_line, end_line, section_text) tuples.
    """
    lines = content.split('\n')
    sections = []
    in_section = False
    section_start = 0
    section_lines = []

    for i, line in enumerate(lines):
        is_start = re.match(start_pattern, line) is not None
        is_end = any(re.match(p, line) for p in end_patterns)

        # Close the open section first, so that a new start line does not
        # silently discard the section that precedes it
        if in_section and (is_start or is_end):
            sections.append((section_start, i - 1, '\n'.join(section_lines)))
            in_section = False
            section_lines = []

        if is_start:
            in_section = True
            section_start = i
            section_lines = [line]
        elif in_section:
            section_lines.append(line)

    # Handle section at end of file
    if in_section and section_lines:
        sections.append((section_start, len(lines) - 1, '\n'.join(section_lines)))

    return sections

# Example: Extract all cross sections
xs_sections = extract_sections(
    geometry_content,
    start_pattern=r'^Type RM Length',
    end_patterns=[r'^Type RM Length', r'^River Reach=', r'^$']
)
See Also¶
- Geometry Operations - Using RasGeometry class
- HEC-RAS File Formats - File naming conventions