Hello everyone! This project was developed by Aya Harrak, Oumama Lemouakni, and Er-Rougui Saad as part of the Morocco High Speed Train Hackathon.
This README provides the steps to run the code included in this project.
To run this project, ensure you have the following:
- Operating System: Ubuntu 22.04
- ROS2: Used for fake sensor data publishing.
  Installation Guide for ROS2 on Ubuntu
- Kafka: Used for real-time data streaming.
  [Apache Kafka Installation Guide](https://kafka.apache.org/documentation/quickstart)
- Apache Spark: Used for data processing.
  [Apache Spark Installation Guide](https://spark.apache.org/docs/latest/)
- Dependencies for Spark:
  Download the necessary dependencies (`--jars` files) and place them in the directory `~/spark_jars`.
- Grafana: Used for data visualization and dashboards.
  Grafana Installation Guide
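
Before moving on, it can help to confirm that the command-line tools used later in this README are reachable. The following is a minimal sketch and not part of the project: `REQUIRED_TOOLS` and `find_missing_tools` are hypothetical names, and the executable names in the list are assumptions that may differ on your machine. Note that `ros2` typically appears on `PATH` only after the ROS2 environment has been sourced.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ["ros2", "colcon", "spark-submit", "python3"]


def find_missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

Kafka is not checked here because this README runs it from `~/kafka/bin` rather than from `PATH`.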
To set up the ROS2-based sensor simulation:
- Navigate to the RailGuards ROS2 workspace:
  `cd ~/railGuards/ros2`
- Build the ROS2 package:
  `colcon build --symlink-install`
- Source the ROS2 workspace by adding it to your bash configuration:
```bash
echo "source ~/railGuards/ros2/install/setup.bash" >> ~/.bashrc
source ~/.bashrc
```
- Run the sensor simulator:
  `ros2 run sensors_simulator full_sensor_publisher`
- Verify the data:
  - List all ROS2 topics:
    `ros2 topic list`
  - Echo a topic, for example:
    `ros2 topic echo /Oil_temperature`

To set up Kafka for data streaming:
- Install Kafka and unzip it into the directory `~/kafka`.
- Start Zookeeper and then the Kafka broker:
  - In one terminal:
    `~/kafka/bin/zookeeper-server-start.sh ~/kafka/config/zookeeper.properties`
  - In another terminal:
    `~/kafka/bin/kafka-server-start.sh ~/kafka/config/server.properties`
- Verify Kafka is running:
  `sudo systemctl status kafka`
  - If it is not running, start and enable it:
    `sudo systemctl start kafka`
    `sudo systemctl enable kafka`
- Launch the ROS2-Kafka bridge to send sensor data to Kafka:
  `ros2 run sensors_simulator full_ros2_kafka_bridge`
- Listen to the Kafka topic (`fulldata`) to view sensor data being streamed:
  `~/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic fulldata`

Sensor data contains noise and requires processing. To process data using Apache Spark:
- Navigate to the data processing directory:
  `cd ~/railGuards/data_processing`
- Run the Spark job:
  `spark-submit --jars ~/spark_jars/<required-jar-files> full_spark_processing.py`
- Verify that the cleaned data is being written to the Kafka topic (`cleaned_sensor_data`) and InfluxDB.
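
The `--jars` option of `spark-submit` expects a comma-separated list of JAR paths. As a rough sketch of how that list could be assembled, the helper below joins every `.jar` file found in a directory. It is not part of the project: `build_spark_submit_command` is a hypothetical name, and the sketch assumes that `~/spark_jars` contains only the JARs the job actually needs.

```python
from pathlib import Path


def build_spark_submit_command(jar_dir, script):
    """Build a spark-submit argument list that passes every .jar in `jar_dir`.

    Assumption: every JAR in the directory is required by the job.
    """
    jars = sorted(Path(jar_dir).expanduser().glob("*.jar"))
    command = ["spark-submit"]
    if jars:
        # --jars takes a single comma-separated value, not one flag per file.
        command += ["--jars", ",".join(str(jar) for jar in jars)]
    command.append(script)
    return command
```

For example, `" ".join(build_spark_submit_command("~/spark_jars", "full_spark_processing.py"))` gives a command line that can be pasted into the terminal from the data processing directory.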
To visualize the processed data:
- Install and set up Grafana.
- Connect Grafana to InfluxDB as a data source.
- Import the provided Grafana dashboard JSON file (`~/railGuards/grafana/grafana_dashboard.json`) or manually create dashboards.
- Use InfluxDB queries to visualize:
  - Sensor metrics (e.g., temperature, pressure).
  - Maintenance flags and anomaly rates from the Kafka topic `future_anomaly_predictions`.
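
Before importing the dashboard, a quick look at the JSON file can confirm that it parses and show which panels it defines. This is only an illustrative sketch: `list_panel_titles` is a hypothetical helper, and it assumes the usual Grafana dashboard layout with a top-level `title` and a `panels` list. The contents of the project's dashboard file are not described in this README, so the output will depend on that file.

```python
import json
from pathlib import Path


def list_panel_titles(dashboard_path):
    """Return (dashboard title, list of panel titles) from a dashboard JSON file."""
    with Path(dashboard_path).expanduser().open(encoding="utf-8") as f:
        dashboard = json.load(f)
    # API-style payloads wrap the model in a top-level "dashboard" key.
    dashboard = dashboard.get("dashboard", dashboard)
    panels = dashboard.get("panels", [])
    return dashboard.get("title"), [p.get("title", "<untitled>") for p in panels]
```

Calling `list_panel_titles("~/railGuards/grafana/grafana_dashboard.json")` raises an error if the file is missing or is not valid JSON, which is a cheap check to run before using the Grafana import dialog.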
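
To run the machine learning deployment script:
- Navigate to the machine learning data preprocessing directory:
  `cd ~/railGuards/machine_learning/data_preprocessing/`
- Then run:
  `python3 deployement.py`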