Team TII EuroRacing (TII-ER) developed a full autonomous software stack for oval racing, enabling speeds above 75 m/s (270 km/h). The software includes modules for perception, planning, control, vehicle dynamics modeling, simulation, telemetry, and safety. The team achieved second and third place in the first two Indy Autonomous Challenge events using this stack.
Researchers at MBZUAI have introduced a novel approach to enhance Large Multimodal Models (LMMs) for autonomous driving by integrating 3D tracking information. This method uses a track encoder to embed spatial and temporal data, enriching visual queries and improving the LMM's understanding of driving scenarios. Experiments on DriveLM-nuScenes and DriveLM-CARLA benchmarks demonstrate significant improvements in perception, planning, and prediction tasks compared to baseline models.
The paper introduces OmniGen, a unified framework for generating aligned multimodal sensor data for autonomous driving using a shared Bird's Eye View (BEV) space. It uses a novel generalizable multimodal reconstruction method (UAE) to jointly decode LiDAR and multi-view camera data through volume rendering. The framework incorporates a Diffusion Transformer (DiT) with a ControlNet branch to enable controllable multimodal sensor generation, demonstrating good performance and multimodal consistency.
This study introduces a reinforcement learning (RL) framework using Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) to optimize the cleaning schedules of photovoltaic panels in arid regions. Applied to a case study in Abu Dhabi, the PPO-based framework demonstrated up to 13% cost savings compared to simulation optimization methods by dynamically adjusting cleaning intervals based on environmental conditions. The research highlights the potential of RL in enhancing the efficiency and reducing the operational costs of solar power generation.
The paper introduces ILION, a deterministic execution gate designed to ensure the safety of autonomous AI agents by classifying proposed actions as either BLOCK or ALLOW. ILION uses a five-component cascade architecture that operates without statistical training, API dependencies, or labeled data. Evaluation against existing text-safety infrastructures demonstrates ILION's superior performance in preventing unauthorized actions, achieving an F1 score of 0.8515 with sub-millisecond latency.