DANAOS (DAN) presented a holistic approach that combines several multidisciplinary frameworks to facilitate the use of a core module of the broader DT4GS framework, namely the OML (Open Model Library). More specifically, DAN experimented with a set of streaming tools that aim to simplify and automate the way the various models and their associated data are described, and to support their continuous integration and deployment. Furthermore, the workflow from data acquisition and processing to model training and deployment was defined in detail by implementing one of the most prominent working examples for environmental compliance and emission reduction, namely Fuel Oil Consumption (FOC) approximation. In the context of FOC estimation, the streaming capabilities of state-of-the-art frameworks such as Kafka and Spark were leveraged for data processing and curation, while a NoSQL database (MongoDB) was used to accelerate the storing and indexing of these data. Once the data have been processed, structured information in the form of a Knowledge Graph (Neo4j) is consumed (KH). The graph interconnects the specific use case with the relevant data and with the parameters describing the corresponding model, e.g. Data class (sensory, telegrams, granularity, etc.); Data processing type; Model Type / Architecture (regression / classification, deep learning / machine learning, etc.); Training Type (continuous, etc.); Scalability (eligible for CPU / GPU optimization, quantum analogs); Security (certificates, accessibility); Integration (topology inside the DT ecosystem); and Deployment (locally, on edge, as a WS, as a DLL, as JSON, ONNX, H5, etc.).
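To make the model description more concrete, the parameters listed above could be collected in a single descriptor and serialized to JSON. The following is only a minimal sketch: the class name ModelDescriptor, the helper to_blueprint, every field name and every example value are hypothetical and do not reproduce the actual DT4GS or OML schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ModelDescriptor:
    """Hypothetical descriptor mirroring the parameter groups named in the text."""
    use_case: str
    data_class: dict            # e.g. source kind and granularity
    data_processing_type: str
    model_type: str
    training_type: str
    scalability: list           # eligible optimization targets
    security: dict              # certificates, accessibility
    integration: str            # position inside the DT ecosystem
    deployment: list            # supported deployment forms


def to_blueprint(descriptor: ModelDescriptor) -> str:
    """Serialize the descriptor to a JSON document (one of the blueprint formats named in the text)."""
    return json.dumps(asdict(descriptor), indent=2, sort_keys=True)


# Illustrative values only; none of them come from the project itself.
foc = ModelDescriptor(
    use_case="FOC approximation",
    data_class={"source": "sensory", "granularity": "1min"},
    data_processing_type="streaming",
    model_type="regression / deep learning",
    training_type="continuous",
    scalability=["CPU", "GPU"],
    security={"certificates": True, "accessibility": "restricted"},
    integration="OML",
    deployment=["edge", "WS", "ONNX"],
)

blueprint = to_blueprint(foc)
print(blueprint)
```

Keeping the descriptor as a typed structure and deriving the JSON from it means the same object could also be rendered to other template formats without changing the fields.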
Based on the initial Model Blueprint (JSON, XML, OASIS TOSCA templates) constructed by the procedure described above, a corresponding simulation model (Deep Learning, Machine Learning, Analytical [Keras, Theano, PyTorch, Java, R, Simulink]) is configured for training within the Models Repository ecosystem. The Models Repository is a web-based integrated environment (MLflow) supported by a set of containers. It aims to automate and standardize the way the models, their different versions and their associated parameters (metadata, accuracy, train-test size, features) are described and provided to the end users (Shipowners, Software Developers, External Vendors, etc.). With this framework the user will be able to continuously monitor the models' performance, while an automated administrative workflow orchestrated by Airflow will be responsible for updating and refining these models as new data are acquired from the LLs.
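The versioning and refresh logic described above can be illustrated with a small sketch in plain Python. It deliberately does not use the MLflow or Airflow APIs; the names ModelVersion, ModelEntry and needs_refresh, the thresholds, the feature names and all numeric values are assumptions made for illustration and are not project results.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    """Metadata kept per model version (the fields named in the text)."""
    version: int
    accuracy: float
    train_size: int
    test_size: int
    features: tuple


@dataclass
class ModelEntry:
    """Hypothetical repository entry holding all versions of one model."""
    name: str
    versions: list = field(default_factory=list)

    def register(self, accuracy, train_size, test_size, features):
        # Version numbers start at 1 and increase with each registration.
        v = ModelVersion(len(self.versions) + 1, accuracy,
                         train_size, test_size, tuple(features))
        self.versions.append(v)
        return v

    def latest(self):
        return self.versions[-1]


def needs_refresh(entry, live_accuracy, new_samples,
                  max_drop=0.05, min_new_samples=1000):
    """Assumed policy: retrain if monitored accuracy dropped by more than
    max_drop relative to the latest version, or if enough new data arrived."""
    degraded = entry.latest().accuracy - live_accuracy > max_drop
    enough_data = new_samples >= min_new_samples
    return degraded or enough_data


# Illustrative values only.
entry = ModelEntry("foc_estimator")
entry.register(accuracy=0.92, train_size=8000, test_size=2000,
               features=["speed", "draft", "wind_speed"])

print(needs_refresh(entry, live_accuracy=0.91, new_samples=200))
print(needs_refresh(entry, live_accuracy=0.80, new_samples=200))
```

In an orchestrated setting, a check of this kind would typically run as a scheduled task whose result decides whether a retraining task is triggered and a new version is registered.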
DT4GS has received funding from the Horizon Europe framework programme under Grant Agreement No 101056799.