**Program Status:** Early Access Program (EAP)
**Last Updated:** 2025-01-12
**Target GA Release:** IRIS 2026.1
This document lists current limitations, known bugs, and workarounds for IntegratedML Custom Models during the Early Access Program. Please review this document before reporting issues to avoid duplicate reports.
Before reporting a bug, check:
- ✅ Is it listed in this document?
- ✅ Is there a documented workaround?
- ✅ Have you tried the troubleshooting guide?
If the answer is "no" to all three, please report via feedback channels.
These are intentional design limitations in the current EAP release. Some may be addressed in future releases based on user feedback.
Issue: Direct timeseries model integration (e.g., ARIMA, Prophet as standalone models) is not yet fully supported in the pluggable models architecture.
Impact: You cannot directly plug in pure timeseries models that expect sequential data without additional wrapper logic.
Workaround:
- ✅ Use the Sales Forecasting demo as a reference - it shows how to use Prophet within a hybrid model
- ✅ Create a wrapper class that converts IRIS tabular data to timeseries format
- ✅ Combine timeseries models with traditional ML models (as shown in `HybridForecastingModel`)
Status: Under investigation for GA release. Feedback welcome on desired timeseries model integration patterns.
Example Workaround (from Sales Forecasting demo):

```python
class HybridForecastingModel(RegressionModel):
    """Combines Prophet (timeseries) with LightGBM (regression)"""

    def fit(self, X, y):
        # Convert tabular data to Prophet format
        prophet_data = self._prepare_prophet_data(X, y)
        self.prophet_model.fit(prophet_data)

        # Use Prophet predictions as features for LightGBM
        prophet_predictions = self.prophet_model.predict(...)
        enhanced_features = self._add_prophet_features(X, prophet_predictions)
        self.lgbm_model.fit(enhanced_features, y)
        return self
```

Issue: Model names must be globally unique across all custom models (classifiers and regressors).
Impact: If two models have the same class name (even in different files), the last one loaded will override the first.
Workaround:
- ✅ Use descriptive, unique class names: `CreditRiskClassifier`, `FraudEnsembleDetector`, etc.
- ✅ Include the domain or use case in the model name: `SalesForecastHybridModel`
- ❌ Avoid generic names: `Model`, `Classifier`, `Predictor`
Status: This is a current architecture constraint. May be relaxed in GA with namespace support.
Issue: Custom models must implement all required methods of the scikit-learn-like interface.
Required methods:
- `fit(X, y)` - Train the model
- `predict(X)` - Make predictions
- `predict_proba(X)` - Predict class probabilities (classification only)
- `get_params(deep=True)` - Get model parameters
- `set_params(**params)` - Set model parameters
Impact: Models missing any required method will fail at training or prediction time with unclear error messages.
Workaround:
- ✅ Inherit from the `ClassificationModel`, `RegressionModel`, or `EnsembleModel` base classes (recommended)
- ✅ Review the base class implementations in `shared/models/` for correct patterns
- ✅ Test all required methods before deploying to IRIS
Status: Working as designed. Enhanced error messages planned for GA.
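To make the required interface concrete, here is a minimal standalone sketch that implements every required method explicitly. The class and its trivial "mean baseline" estimator are illustrative, not part of this repo; in practice you would inherit from `RegressionModel` (which supplies `get_params`/`set_params`), and a classifier would additionally need `predict_proba(X)`:

```python
import numpy as np

class MeanBaselineRegressor:
    """Illustrative regressor showing the full required interface.

    'Training' just remembers the target mean; a real model would wrap
    an actual estimator and normally inherit from RegressionModel.
    """

    def __init__(self, offset=0.0):
        self.offset = offset  # example hyperparameter
        self.mean_ = None

    def fit(self, X, y):
        # Store the target mean as the fitted state
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        if self.mean_ is None:
            raise RuntimeError("Model is not fitted; call fit(X, y) first")
        return np.full(len(X), self.mean_ + self.offset)

    def get_params(self, deep=True):
        return {"offset": self.offset}

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self
```

Because every method is present, the model fails fast with a clear message when used before fitting, rather than raising an opaque `AttributeError` inside IRIS.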
Issue: EAP testing has been primarily conducted on macOS. Linux and Windows support is secondary.
Impact:
- Installation may require platform-specific troubleshooting on Linux/Windows
- Some demo data generation scripts may have platform-specific path issues
- Docker setup is most reliable across platforms
Workaround:
- ✅ Recommended: Use the Docker setup (`make setup`) for the most reliable cross-platform experience
- ✅ Linux users: Generally works well; minor path issues possible
- ⚠️ Windows users: Use WSL2 or Docker for best results
Status: Full multi-platform testing planned before GA. Please report platform-specific issues!
Tested Platforms:
- ✅ macOS 13+ (Ventura, Sonoma) - Primary
- ⚠️ Ubuntu 22.04 LTS - Secondary testing
- ⚠️ Windows 11 + WSL2 - Limited testing
- ⚠️ Windows 11 + Docker Desktop - Limited testing
Issue: Requires Python 3.8 or later. Python 3.11+ recommended for full AutoML compatibility.
Impact: Older Python installations (3.6, 3.7) are not supported.
Workaround:
- ✅ Use `pyenv` or `conda` to install Python 3.8+
- ✅ The Docker setup includes the correct Python version automatically
Status: This is a dependency requirement and will not change. Python 3.8+ is required.
Issue: After modifying a custom model Python file, you must restart the IRIS terminal (or IRIS instance) for changes to take effect.
Impact: Iterative development is slower - each model change requires a restart.
Workaround:
- ✅ Develop and test models in standard Python/Jupyter environment first
- ✅ Unit test your model with pytest before deploying to IRIS
- ✅ Only deploy to IRIS once model logic is working
- ⚠️ For IRIS testing, restart the terminal after each model update:

```shell
# In IRIS terminal
halt

# Then reconnect or restart the IRIS container
docker restart iml-custom-models-iris
```
Status: This is a current architecture limitation. Hot-reload functionality is being investigated for GA.
Example Development Workflow:

```shell
# 1. Develop model locally
cd demos/my_demo/
pytest tests/test_my_model.py  # Unit test outside IRIS

# 2. Deploy to IRIS
cp models/my_model.py /path/to/iris/mgr/python/custom_models/

# 3. Restart IRIS
docker restart iml-custom-models-iris

# 4. Test in SQL
# ... run SQL commands ...
```

Issue: InterSystems' built-in AutoML models are exposed as `.py` files in `/iris/Mgr/python/AutoML/`. You can modify or remove these files.
Impact:
- ⚠️ Modifying built-in models can break AutoML functionality
- ⚠️ Removing built-in models will cause AutoML to fail
- ⚠️ Changes persist and affect all users of the IRIS instance
Workaround:
- ❌ Do NOT modify files in `/iris/Mgr/python/AutoML/Classifiers/` or `Regressors/`
- ✅ Place custom models in a separate directory (e.g., `/iris/Mgr/python/custom_models/`)
- ✅ Use the `pathtoclassifiers` and `pathtoregressors` parameters to point to custom directories
Status: This is a current architecture design. Better isolation planned for GA.
Recommended Directory Structure:

```
/iris/Mgr/python/
├── AutoML/                  # DO NOT MODIFY
│   ├── Classifiers/         # Built-in AutoML classifiers
│   └── Regressors/          # Built-in AutoML regressors
└── custom_models/           # Your custom models
    ├── classifiers/
    │   ├── credit_risk_classifier.py
    │   └── fraud_detector.py
    └── regressors/
        └── sales_forecaster.py
```
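A quick sanity check that model files landed in the expected layout can be scripted; this small sketch (the helper name is illustrative) lists the `.py` files under a custom-models root:

```python
from pathlib import Path

def list_custom_models(root):
    """Return the .py model files under a custom_models directory,
    grouped by the classifiers/regressors subdirectories shown above."""
    root = Path(root)
    found = {}
    for sub in ("classifiers", "regressors"):
        subdir = root / sub
        # Missing subdirectories are reported as empty rather than raising
        found[sub] = sorted(p.name for p in subdir.glob("*.py")) if subdir.is_dir() else []
    return found

# Example: list_custom_models("/iris/Mgr/python/custom_models")
```

Running this before `TRAIN MODEL` makes "Module not found" surprises much rarer.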
Issue: Training very large models (>1GB in memory, >1 hour training time) may exceed SQL timeout limits.
Impact: SQL TRAIN MODEL command may time out before training completes.
Workaround:
- ✅ Pre-train large models outside IRIS using standard Python
- ✅ Save the trained model using pickle/joblib
- ✅ Load the pre-trained model in the custom model's `__init__()` method
- ✅ Override `fit()` to skip training if the model is already trained
- ⚠️ For production: Consider incremental learning or model updates outside IRIS
Status: This is a SQL execution timeout limitation. Async training is being considered for GA.
Example Pre-Trained Model Pattern:

```python
import os

import joblib

class PreTrainedClassifier(ClassificationModel):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Load a pre-trained model if a path was provided
        model_path = kwargs.get('pretrained_model_path')
        if model_path and os.path.exists(model_path):
            self.model = joblib.load(model_path)
            self._is_fitted = True
        else:
            self.model = MyLargeModel()
            self._is_fitted = False

    def fit(self, X, y):
        if self._is_fitted:
            return self  # Skip training if already fitted
        # Otherwise train as normal
        self.model.fit(X, y)
        return self
```

Issue: Very large custom model files (>100MB) may cause slow load times or memory issues.
Impact: IRIS may take a long time to load large model files, affecting TRAIN MODEL performance.
Workaround:
- ✅ Keep model code separate from model weights
- ✅ Load model weights from external files in `__init__()`
- ✅ Use efficient serialization (joblib, pickle) for model weights
- ✅ Consider model compression techniques
Status: This is a general Python module loading limitation. Best practices documentation will be expanded for GA.
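A minimal sketch of the "code separate from weights" pattern, using stdlib `pickle` for portability (joblib works the same way; the file name and helper names here are illustrative):

```python
import pickle
from pathlib import Path

WEIGHTS_FILE = "my_model_weights.pkl"  # illustrative file name

def save_weights(model, directory):
    """Persist a fitted estimator next to, not inside, the model code."""
    path = Path(directory) / WEIGHTS_FILE
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path

def load_weights(directory):
    """Load the estimator (e.g., in __init__) so the .py file stays small.

    Returns None when no weights file exists, so the caller can fall
    back to training from scratch."""
    path = Path(directory) / WEIGHTS_FILE
    if not path.exists():
        return None
    with open(path, "rb") as f:
        return pickle.load(f)
```

This keeps the `.py` file IRIS loads at a few KB regardless of how large the trained weights grow.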
Issue: Current demos cover common use cases, but advanced patterns (model ensembles, custom metrics, complex preprocessing) have limited documentation.
Impact: Users attempting advanced use cases may need to reverse-engineer patterns from demo code.
Workaround:
- ✅ Review demo source code in `demos/*/models/` for patterns
- ✅ Check `shared/models/` for base class implementations
- ✅ Consult the API reference documentation
- ✅ Contact support for specific advanced use case guidance
Status: Expanding advanced examples based on EAP feedback. Please share your advanced use cases!
Areas We Want Feedback On:
- Custom loss functions
- Multi-output models
- Streaming/incremental learning
- Model interpretability (SHAP, LIME integration)
- A/B testing patterns
- Model monitoring and drift detection
Issue: Security best practices, performance tuning, and operational considerations are documented but not comprehensive.
Impact: Users may miss important production considerations.
Workaround:
- ✅ Review `deployment.md` for current guidance
- ✅ Use the EAP period to evaluate production readiness
- ✅ Provide feedback on missing operational considerations
Status: Production documentation will be expanded in Phase 2 based on EAP feedback.
These are confirmed bugs that will be fixed before GA or have documented workarounds.
Severity: Medium
Description: On some Linux distributions, Docker volume mounts may have incorrect permissions, preventing IRIS from writing to /opt/irisapp/data.
Symptoms:
- IRIS container fails to start
- Error: "Permission denied" in IRIS logs
- Models cannot be loaded
Workaround:
```shell
# Option 1: Fix volume permissions
sudo chown -R 51773:51773 ./data

# Option 2: Use docker-compose with a user override
docker-compose run --user root iris bash
chown -R 51773:51773 /opt/irisapp/data
exit
docker-compose up -d
```

Status: Investigating fix for GA. The docker-compose configuration will be updated.
Tracking: Related to JIRA tickets on Docker deployment
Severity: Low
Description: On fresh IRIS installations, the required symlink from /usr/irissys/mgr/python/iris_automl to /opt/irisapp/data/mgr/python/iris_automl may not exist.
Symptoms:
- `TRAIN MODEL` fails with "Module not found: iris_automl"
- AutoML provider not available
Workaround:
```shell
# Connect to the IRIS container
docker exec -it iml-custom-models-iris bash

# Create the symlink
ln -sf /usr/irissys/mgr/python/iris_automl /opt/irisapp/data/mgr/python/iris_automl

# Restart IRIS
exit
docker restart iml-custom-models-iris
```

Status: Will be automated in installation scripts for GA.
Tracking: Installation automation improvements
Severity: Medium
Description: When a custom model is missing a required method (fit, predict, etc.), the error message is unclear and doesn't indicate which method is missing.
Symptoms:
- Generic Python exception during `TRAIN MODEL` or `PREDICT()`
- Error message: "AttributeError" without specifying the missing method
Workaround:
- ✅ Always inherit from the `ClassificationModel` or `RegressionModel` base classes
- ✅ Test your model with pytest before deploying to IRIS
- ✅ Review the base class interface in `shared/models/base.py`
Status: Enhanced error messages planned for GA.
Tracking: Validation and error handling improvements
Severity: Low
Description: Models that contain complex nested objects (e.g., custom transformers, third-party models) may fail to serialize correctly during TRAIN MODEL.
Symptoms:
- `TRAIN MODEL` completes but model state is not saved
- `PREDICT()` fails because the model is not fitted
- Error: "PickleError" or "Serialization failed"
Workaround:
- ✅ Override the `_get_model_state()` and `_set_model_state()` methods
- ✅ Manually serialize complex objects using joblib or pickle
- ✅ Review `EnsembleFraudDetector` for an example of complex state management
Example:

```python
import pickle

def _get_model_state(self):
    """Custom serialization for a complex model."""
    # Note: joblib.dump/load work on file paths or file objects;
    # pickle.dumps is used here because it returns in-memory bytes.
    return {
        'model': pickle.dumps(self.model),              # Serialize complex objects to bytes
        'preprocessor': pickle.dumps(self.preprocessor),
        'metadata': self.metadata                       # Simple objects can be stored directly
    }

def _set_model_state(self, state):
    """Custom deserialization."""
    self.model = pickle.loads(state['model'])
    self.preprocessor = pickle.loads(state['preprocessor'])
    self.metadata = state['metadata']
```

Status: Documentation will be improved with serialization best practices.
Tracking: State management enhancements
Severity: Low
Description: Invalid JSON in the USING clause may produce unclear error messages instead of JSON validation errors.
Symptoms:
- SQL syntax error with unclear message
- JSON parsing fails silently
Workaround:
- ✅ Validate JSON syntax before using it in SQL (use a JSON validator)
- ✅ Use single quotes for JSON in SQL: `USING '{"param": "value"}'`
- ✅ Escape double quotes if using double-quoted JSON
Status: Better JSON validation and error messages planned for GA.
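Until then, a simple pre-flight check in Python surfaces malformed JSON with a precise location before it reaches the USING clause (the helper name is illustrative):

```python
import json

def validate_using_clause(raw):
    """Parse USING-clause JSON and fail with a pointed message.

    Returns the parsed dict, or raises ValueError with the line/column
    of the syntax error instead of letting SQL fail opaquely."""
    try:
        params = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"Invalid USING JSON at line {exc.lineno}, col {exc.colno}: {exc.msg}"
        ) from exc
    if not isinstance(params, dict):
        raise ValueError("USING clause must be a JSON object")
    return params
```

Running every USING payload through a check like this turns a silent SQL failure into an immediate, readable error.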
Example:

```sql
-- ✅ CORRECT: Single quotes around JSON
TRAIN MODEL my_model
USING '{"model_name": "MyClassifier", "user_params": {"param1": 1}}'

-- ❌ INCORRECT: Unescaped double quotes
TRAIN MODEL my_model
USING {"model_name": "MyClassifier"}  -- Will fail
```

Tracking: SQL parameter validation improvements
Severity: High (breaking — silent failure)
Description: JSON parameter keys in the USING clause that contain underscores are silently ignored by IRIS. The AutoML engine only recognizes concatenated (no-underscore) parameter names.
Affected parameters:

| Wrong (silently ignored) | Correct |
|---|---|
| `path_to_classifiers` | `pathtoclassifiers` |
| `path_to_regressors` | `pathtoregressors` |
| `isc_models_disabled` | `iscmodelsdisabled` |
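Because unrecognised underscore keys fail silently, a small hypothetical helper can normalize a parameter dict before the USING clause is built (the mapping mirrors the table above; the function name is illustrative):

```python
# Keys IRIS recognises only in concatenated, underscore-free form
UNDERSCORE_TO_CONCAT = {
    "path_to_classifiers": "pathtoclassifiers",
    "path_to_regressors": "pathtoregressors",
    "isc_models_disabled": "iscmodelsdisabled",
}

def normalize_using_params(params):
    """Rewrite underscore variants to the names the AutoML engine accepts.

    Keys not in the mapping pass through unchanged."""
    return {UNDERSCORE_TO_CONCAT.get(k, k): v for k, v in params.items()}

# Example:
# normalize_using_params({"path_to_classifiers": "/opt/models", "seed": 42})
# → {"pathtoclassifiers": "/opt/models", "seed": 42}
```

Normalizing in one place means a typo produces a correct clause, or at worst a visible key, rather than a silently ignored one.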
Symptoms:
- `TRAIN MODEL` completes with no custom model loaded
- IRIS falls back to built-in AutoML classifiers or raises `NoEstimatorChosen`
- No error is raised for unrecognised keys
Workaround:

```sql
CREATE MODEL FraudDetectionEnsemble PREDICTING (is_fraud) FROM TransactionData
USING {"pathtoclassifiers": "/opt/irisapp/demos/fraud_detection/iris_models", "iscmodelsdisabled": 1}
```

Additional requirements:
- The directory must contain `.py` files, each defining a class named exactly `IRISModel`
- `IRISModel` files must be fully self-contained (no imports from this repo)
- `IRISModel` must expose `self.model`, `fit(X, y)`, `predict(X)`, `predict_proba(X)`, `get_params()`, `set_params(**params)`
- Use `StandardScaler(with_mean=False)`; IRIS passes sparse matrices during cross-validation
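A hypothetical self-contained model file satisfying the requirements above might look like this sketch (the estimator choice and hyperparameters are illustrative, not prescribed by the repo):

```python
# Illustrative contents of a file placed in the pathtoclassifiers directory.
# The class name must be exactly IRISModel, and nothing here imports from
# this repo, so the file is fully self-contained.
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

class IRISModel:
    def __init__(self, **params):
        # with_mean=False: IRIS passes sparse matrices during
        # cross-validation, and mean-centering would densify them
        self.model = make_pipeline(
            StandardScaler(with_mean=False),
            LogisticRegression(max_iter=1000),
        )
        if params:
            self.model.set_params(**params)

    def fit(self, X, y):
        self.model.fit(X, y)
        return self

    def predict(self, X):
        return self.model.predict(X)

    def predict_proba(self, X):
        return self.model.predict_proba(X)

    def get_params(self, deep=True):
        return self.model.get_params(deep=deep)

    def set_params(self, **params):
        self.model.set_params(**params)
        return self
```

Everything IRIS needs (`self.model`, the five interface methods) is exposed directly, and all dependencies are third-party packages rather than repo modules.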
Status: Documentation gap; parameter names are fixed in the IRIS AutoML engine. All demo SQL files in this repo have been updated to use the correct names.
Tracking: FR-018
Severity: Low
Description: SELECT ... PREDICT() on very large tables (>1M rows) may be slower than expected.
Impact: Batch predictions on large datasets may take several minutes.
Workaround:
- ✅ Use `WHERE` clauses to limit prediction scope
- ✅ Batch predictions in smaller chunks (e.g., 10K-100K rows at a time)
- ✅ Consider materialized views for frequently-used predictions
- ✅ Use SQL `TOP` or `LIMIT` for testing
Status: Performance optimization ongoing. Feedback welcome on specific performance requirements.
Example:

```sql
-- ✅ Good: Batch predictions
SELECT TOP 10000 id,
       PREDICT(MyModel) AS prediction
FROM LargeTable
WHERE prediction_date = CURRENT_DATE

-- ⚠️ Slow: Full table prediction
SELECT id, PREDICT(MyModel) AS prediction
FROM LargeTable  -- 5M rows
```

Tracking: Query performance optimization
Severity: Low
Description: The sales forecasting demo's data generation script can take 2-3 minutes to generate realistic multi-year timeseries data.
Impact: First run of sales demo is slower than other demos.
Workaround:
- ✅ This is expected behavior - realistic timeseries data generation takes time
- ✅ Data is cached after first generation
- ✅ Reduce data volume in config for faster testing (see demo README)
Status: Acceptable for demo purposes. Pre-generated data may be provided for GA.
Severity: Low
Description: DNA similarity demo requires biopython which may not be installed by default.
Symptoms:
- Import error: "No module named 'Bio'"
- Demo fails to run
Workaround:
```shell
# Install biopython
pip install biopython

# Or use the demo-specific requirements
pip install -r demos/dna_similarity/requirements.txt
```

Status: Will be documented more clearly in the installation guide.
Tracking: Demo dependency documentation
When you encounter an issue:
- Check this document - Is it a known issue?
- Check `TROUBLESHOOTING.md` - Common solutions
- Check `EAP_FAQ.md` - Frequently asked questions
- Search JIRA (if you have access) - Existing bug reports
- Contact support - Email thomas.dyar@intersystems.com
| Issue | Quick Fix |
|---|---|
| Terminal restart needed | `docker restart iml-custom-models-iris` |
| Permission denied (Linux) | `sudo chown -R 51773:51773 ./data` |
| Module not found | Verify model file placement, check symlink |
| JSON syntax error | Use single quotes in SQL: `USING '{...}'` |
| Slow predictions | Add a `WHERE` clause, batch in smaller chunks |
| Model not fitted | Check serialization, override state methods |
| Import error | Install missing dependencies with pip |
Based on current EAP feedback and internal roadmap:
- ✅ Enhanced error messages for missing model methods
- ✅ Better JSON USING clause validation
- ✅ Improved installation scripts (symlink automation)
- ✅ Hot-reload for model changes (under investigation)
- ✅ Expanded production deployment documentation
- ✅ Multi-platform installation testing
- ✅ Performance optimizations for large predictions
Based on EAP feedback, we're considering:
- Timeseries model native support
- Model namespace/versioning
- Async training for large models
- Model monitoring dashboard integration
- Pre-built model templates library
Your feedback will help prioritize these!
Please confirm:
- ✅ Issue is not listed in this document
- ✅ Issue is not in `TROUBLESHOOTING.md`
- ✅ Issue is not in `EAP_FAQ.md`
- ✅ You've tried basic troubleshooting (restart, check logs)
Email (Recommended for bugs): thomas.dyar@intersystems.com
GitHub Issues (If enabled): Use bug report template
For Bug Reports:
**Title**: Brief description (e.g., "Model fails to load on Windows")
**Description**:
What happened? What did you expect to happen?
**Environment**:
- OS: [e.g., macOS 14.1, Ubuntu 22.04, Windows 11]
- Python version: [e.g., 3.11.5]
- IRIS version: [e.g., 2025.2]
- Installation method: [Docker / Local]
**Steps to Reproduce**:
1. Step 1
2. Step 2
3. Step 3
**Error Messages**:
Paste full error messages here
**Screenshots** (if applicable):
[Attach screenshots]
**Workaround Found**: [If you found a workaround, share it!]
This document will be updated during the EAP as new issues are discovered and workarounds are identified.
Last Updated: 2025-01-12 Next Update: As issues are reported during EAP
To get the latest version of this document:
```shell
git pull origin main
```

Or check the repository: https://github.com/intersystems-community/integratedml-custom-models
Thank you for reviewing this document and helping us identify issues during the EAP. Your patience with these limitations and your feedback will make Custom Models better for everyone!
Questions? Email thomas.dyar@intersystems.com
— The InterSystems Data Platforms Product Team