NIH Workshop on Reaction Informatics, May 18-20, 2021
This workshop will be for people, groups, and companies that are interested in sharing ideas about how reaction(-related) data are represented, captured, managed in databases, analyzed, used for drug design, applied in robotics, and exchanged locally as well as globally. Our intention is not to just have a number of product demos, but rather to have speakers focus on the science of reaction informatics, and present the background, achievements, and remaining challenges of their projects.
The virtual workshop is scheduled for three half-days on May 18-20, from 10:45 AM to 3:30 PM EDT. Registration is free.
NIH Virtual Workshop on Reaction Informatics, May 18-20, 2021, 11AM-3PM EDT [Registration]
Workshop webpage: agenda, abstracts, presentations, and recordings have been added: https://cactus.nci.nih.gov/presentations/NIHReactInf_2021-05/NIHReactInf.html
Wendy Warr's Meeting Report at ChemRxiv (Aug. 19, 2021)
References:
[1] Judson, P.N., Ihlenfeldt, W.-D., Patel, H., Delannée, V., Tarasova, N., Nicklaus, M.C., 2020. Adapting CHMTRN (CHeMistry TRaNslator) for a New Use. J. Chem. Inf. Model. 60(7), 3336–3341. https://doi.org/10.1021/acs.jcim.0c00448 (also on ChemRxiv)
[2a] Hoonakker, F., Lachiche, N., Varnek, A., Wagner, A., 2011. Condensed graph of reaction: considering a chemical reaction as one single pseudo molecule. Int. J. Artif. Intell. Tools 20(2), 253–270.
[2b] Nugmanov, R., Sattarov, B., ... Varnek, A., 2017. CGR-DB. Interactive Reaction Database and CGR-Based Search Engine, in: Antipin, I.S. (Ed.), 3rd Kazan Summer School on Chemoinformatics, Kazan, 5-7 July 2017, Kazan Federal University, p. 64.
[3] Yang, C., Tarkhov, A., Marusczyk, J., Bienfait, B., Gasteiger, J., Kleinoeder, T., Magdziarz, T., Sacher, O., Schwab, C.H., Schwoebel, J., Terfloth, L., Arvidson, K., Richard, A., Worth, A., Rathman, J., 2015. New Publicly Available Chemical Query Language, CSRML, To Support Chemotype Representations for Application to Data Mining and Modeling. J. Chem. Inf. Model. 55, 510–528. https://doi.org/10.1021/ci500667v
Chemotypes are a new approach for representing molecules, chemical substructures and patterns, reaction rules, and reactions. ...Chemotypes are expressed in the XML-based Chemical Subgraphs and Reactions Markup Language (CSRML), and can be encoded not only with connectivity and topology but also with properties of atoms, bonds, electronic systems, or molecules…. A software application, ChemoTyper has also been developed and made publicly available in order to enable chemotype searching and fingerprinting against a target structure set.
[4a] Tremouilhac, P., Huang, P.-C., Lin, C.-L., Huang, Y.-C., Nguyen, A., Jung, N., Bach, F., Bräse, S., 2021. Chemotion Repository, a Curated Repository for Reaction Information and Analytical Data. Chemistry–Methods 1 (1), 8–11. https://doi.org/10.1002/cmtd.202000034
[4b] Tremouilhac, P., Lin, C.-L., Huang, P.-C., Huang, Y.-C., Nguyen, A., Jung, N., Bach, F., Ulrich, R., Neumair, B., Streit, A., Bräse, S., 2020. The Repository Chemotion: Infrastructure for Sustainable Research in Chemistry. Angewandte Chemie International Edition 59 (50), 22771–22778. https://doi.org/10.1002/anie.202007702
[5a] Batchelor, C., 2020. Reaction ontologies and artificial intelligence. Presented at AI3SD AI React 2020, Bristol, UK, March 9-11, 2020 (Presentation, 31 p., abstract, Wendy Warr’s conference report, at p. 21-24)
[5b] Royal Society of Chemistry. RXNO: reaction ontologies (https://github.com/rsc-ontologies/rxno) [2020-06-08]. Consists of RXNO, the name reaction ontology, and MOP, the [underlying] molecular process ontology; terms of both are browsable at the EMBL-EBI Ontology Lookup Service (OLS). RXNO consists of 901 RXNO terms (as of 2021-01-21), such as Diels–Alder cyclization, and 3682 MOP terms (as of 2014-09-03), for example cyclization, methylation, and demethylation.
[6a] Grethe, G., Blanke, G., Kraut, H., Goodman, J.M., 2018. International chemical identifier for reactions (RInChI). Journal of Cheminformatics 10 (1), 22 (9 p.). https://doi.org/10.1186/s13321-018-0277-8
[6b] Grethe, G., Goodman, J.M., Allen, C.H., 2013. International chemical identifier for reactions (RInChI). J Cheminform 5 (1), 45 (9 p.). https://doi.org/10.1186/1758-2946-5-45
[6c] Jacob, P.-M., Lan, T., Goodman, J.M., Lapkin, A.A., 2017. A possible extension to the RInChI as a means of providing machine readable process data. Journal of Cheminformatics 9 (1), 23 (12 p.). https://doi.org/10.1186/s13321-017-0210-6
[7a] Ambit-SMIRKS, http://ambit.sourceforge.net/smirks.html
[7b] Kochev, N., Avramova, S., Jeliazkova, N., 2018. Ambit-SMIRKS: a software module for reaction representation, reaction search and structure transformation. J Cheminform 10, 42. https://doi.org/10.1186/s13321-018-0295-6
[7c] Jeliazkova, N., Kochev, N.T., Rydberg, P., Avramova, S., 2013. Reaction Representation and Structure Transformation with Ambit-SMIRKS. Application in Metabolite Prediction. A poster presented at OpenTox Euro 2013, Mainz, Germany, Sep. 30-Oct. 2, 2013, 1 p.
[7d] Jeliazkova, N., Kochev, N., 2011. AMBIT-SMARTS: Efficient Searching of Chemical Structures and Fragments. Molecular Informatics 30, 707–720. https://doi.org/10.1002/minf.201100028
[8] n/a
[9] Delannée, V., Nicklaus, M., ReactionCode: a new versatile format for searching, analysis, classification, transform, and encoding/decoding of reactions. ChemRxiv 24 p. V3. Revised July 22, 2020 https://chemrxiv.org/ndownloader/files/24030893
[10a] Tomczak, J., 2020. UDM: A Community-Driven Data Format for the exchange of Comprehensive Reaction Information. Presented at AI3SD AI React 2020, Bristol, UK, March 9-11, 2020 (Presentation, 34 p., abstract, Wendy Warr’s conference report, at p. 24-26)
[10b] Pistoia Alliance, 2019. UDM URL https://www.pistoiaalliance.org/projects/current-projects/udm/ (accessed 5.10.21)
[10c] UDM XML Schema v.6.0.0 includes definitions for reaction classes (based on the RSC RXNO reaction ontology) and methods and results types (both based on Allotrope Foundation Taxonomies (AFT) WD/2019/12)
[11] n/a
[12a] NextMove Software, Pistachio: Reaction Data, Querying and Analytics. Version 2021-04-03 (2021Q1) (includes YouTube video, 4:05)
[12b] Mayfield, John, 13,118,970 Reactions and Counting. NextMove Software Blog, March 24, 2021 [The April 2021 release of Pistachio will contain 13,118,970 reactions from the following sources: USPTO Text: Grant 3,290,056, Appl. 3,595,510; USPTO Sketch: Grant 1,186,924, Appl. 1,804,859; WIPO PCT Text: 1,484,646; EPO Text: Grant 1,060,397, Appl. 696,578. Number of unique reactions by RInChI is 4,212,894.]
[12c] Sayle, R., Mayfield, J., Lagerstedt, I., Lowe, D., 2020. Automated mining a database of 9.3M reactions from the patent literature, and its application to synthesis planning. Presented at AI3SD AI React 2020, Bristol, UK, March 9-11, 2020 (Presentation, 80 p., abstract, Wendy Warr’s conference report, at p. 38-42)
[12d] Mayfield, J., O’Boyle, N., Sayle, R. (NextMove Software), Pistachio - Search and Faceting of Large Reaction Databases. Presented at ACS Fall 2017, Washington, D.C. [CINF 13], August 2017, 30 p.
[13a] ChemPass Ltd., SynSpace [webpage]. [SynSpace is the first versatile design platform that harness the power of rule-based AI for forward reaction-based design.]
[13b] Makara, Gergely. ChemPass presentation at Platinum Global Solutions (PGS) 5th Drug Discovery Summit E-Conference, 31st March 2021, YouTube, 35 min.
[13c] Makara, G., 2019. AI-assisted lead optimization with SynSpace. Presented at 3rd Global Engage Pharma R&D Informatics & AI Congress, London, UK, October 28-29, 2019. 35 p.
[14a] Patel, H., Ihlenfeldt, W.-D., Judson, P.N., Moroz, Y.S., Pevzner, Y., Peach, M.L., Delannée, V., Tarasova, N.I., Nicklaus, M.C., 2020. SAVI, in silico generation of billions of easily synthesizable compounds through expert-system type rules. Scientific Data 7(1), 384. https://doi.org/10.1038/s41597-020-00727-4 (Open Access) [While LHASA is retrosynthetic, SAVI is strictly forward-synthetic. This implied the task to make LHASA transforms, which are written for retrosynthetic application, work in a forward-synthetic context]. See also: Nicklaus, Marc., 2020. The story behind the SAVI project. Research Data at Springer Nature Blog, Nov 21, 2020
[14b] See also [1] above.
[14c] Pevzner, Y., Ihlenfeldt, W.D., Nicklaus, M.C., 2016. Synthetically Accessible Virtual Inventory (SAVI). Presented at Seventh Joint Sheffield Conference on Chemoinformatics, Sheffield, UK, July 4-6, 2016 (Abstract) (Presentation).
[15] Clark, Matthew, Finding the Corpus of Knowledge for Machine Learning/AI In Chemistry. Posted on LinkedIn. January 7, 2020.
[16a] Lemonick, S., 2020. CAS opens data vault to MIT scientists. C&EN Global Enterp 98 (44), 6. https://doi.org/10.1021/cen-09844-scicon1 (also in an online C&EN edition); CAS. CAS to collaborate with MIT on research to enhance predictive chemical synthesis planning. Press Release, October 29, 2020
[16b] See also: CAS AI & Machine Learning (web-page). [Bayer scientists increased prediction accuracy by 32 percentage points with scientist-curated data from CAS. Enhanced predictive power in rare reaction classes contributes new, useful results to open up difficult areas of science (see CAS-Bayer White Paper, 2021, 10 p.)]
[17a] Open Reaction Database (ORD) (Documentation); ORD Search/Browse (SMILES/SMARTS)
[17b] Coley, C., Kearnes, S., 2020. The Open Reaction Database. RDKit 2020 Virtual UGM, Oct. 7, 2020 (Presentation, 32 p.; video, 28:16)
[18] Fooshee, D., Mood, A., Gutman, E., Tavakoli, M., Urban, G., Liu, F., Huynh, N., Vranken, D.V., Baldi, P., 2018. Deep learning for chemical reaction prediction. Mol. Syst. Des. Eng. 3 (3), 442–452. https://doi.org/10.1039/C7ME00107J
[19] Schwaller, P., Hoover, B., Reymond, J.-L., Strobelt, H., Laino, T., 2021. Extraction of organic chemistry grammar from unsupervised learning of chemical reactions. Science Advances 7 (15), eabe4166 (10 p.). https://doi.org/10.1126/sciadv.abe4166 [... Transformer Neural Networks learn atom-mapping information between products and reactants without supervision or human labeling. Using the Transformer attention weights, we build a chemically agnostic, attention-guided reaction mapper and extract coherent chemical grammar from unannotated sets of reactions.]
[20a] Warne, Mark (2020) AI3SD Video: Digitising your Chemistry for Recordability, Shareability and Reproducibility. Kanza, Samantha, Frey, Jeremy G., Hooper, Victoria and Knight, Nicola (eds.) AI3SD, PSDS Patterns Failed it to Nailed it: Getting Data Sharing Right Seminar Series 2020, Southampton, UK. 22 Oct - 03 Dec 2020. (doi:10.5258/SOTON/P0069). (See also a conference report, p. 8-9)
[20b] DeepMatter™. DigitalGlassware® [A unique part of DigitalGlassware® is the time-course data collection approach. Actions during your DigitalGlassware® Recipe Run are time-stamped against the sensor data, providing context that would otherwise be missing.]
[21a] Coley, C., ASKCOS: data-driven chemical synthesis. Presented at AI3SD AI React 2020, Bristol, UK, March 10, 2020 (Presentation, 62 p., abstract, Wendy Warr’s conference report, at p. 38-42)
[21b] Coley, C.W., 2020. Chapter 15: Data-driven Prediction of Organic Reaction Outcomes, in: Artificial Intelligence in Drug Discovery. pp. 327–348. https://doi.org/10.1039/9781788016841-00327 (Preview at Google Books eBook)
[22] Gromski, P.S., Granda, J.M., Cronin, L., 2020. Universal Chemical Synthesis and Discovery with ‘The Chemputer.’ Trends in Chemistry 2(1), 4–12. https://doi.org/10.1016/j.trechm.2019.07.004 (open access)
[23] NCATS ASPIRE Laboratory. Informatics. […the ASPIRE (A Specialized Platform for Innovative Research Exploration) initiative is building a high-quality reaction knowledgebase via the integration of historical and high-throughput synthesis data…. ASPIRE prioritizes the early dissemination of novel reaction informatics methods and potentially public data sets]
[24] Mahjour, B., Shen, Y., Liu, W., Cernak, T., 2020. A map of the amine–carboxylic acid coupling system. Nature 580, 71–75. https://doi.org/10.1038/s41586-020-2142-y
[25a] Masquelin, T., Kaerner, A., Bernhardt, R.J., Wang, J., Nicolaou, C.A., 2021. Automated Synthesis, in: Burger’s Medicinal Chemistry and Drug Discovery. American Cancer Society, pp. 1–37. https://doi.org/10.1002/0471266949.bmc261
[25b] Nicolaou, C.A., Watson, I.A., LeMasters, M., Masquelin, T., Wang, J., 2020. Context Aware Data-Driven Retrosynthetic Analysis. J. Chem. Inf. Model. 60, 2728–2738. https://doi.org/10.1021/acs.jcim.9b01141
[26a] SynFini™ Automated Chemistry Platform, SRI International. [Includes “SynRoute™, a computational synthetic planning tool that provides synthetic strategies toward compounds of interest [which utilises AI/big data and machine learning]”]
[26b] Collins, N., SynFini: An Automated Chemical Synthesis Platform. SRI International. Technical report for a grant W911NF-16-C-005. Apr. 6, 2020. 109 p. (See Task 1: SynRoute-Knowledge-Based Route Design, Planning, and Automation, p.19-33; Applications of Make-It Technologies to the Ongoing COVID-19 Outbreak [use of SynRoute], p.8-19)
[26c] Madrid, P., Collins, N., Latendresse, M., Malerich, J., Krummenacker, M., 2019. Computational Generation of Chemical Synthesis Routes and Methods. WO2019156872 (PCT/US2019/015868) Published 2019-08-15
[26d] Latendresse, M., Madrid, P., Krummenacker, M., Malerich, J., Karp, P., Collins, N., SRI International, Integrating AI with Robust Automated Chemistry: AI Driven Route Design and Automated Reaction & Route Validation. Presented at AI3SD AI React 2020, Bristol, UK, March 9-11, 2020. (Abstract; Wendy Warr’s conference report, at p. 42-45)
[27a] Spaya AI-powered retrosynthesis platform by Iktos https://spaya.ai/
[27b] Tajmouati, H., Parrot, M., Skiredj, A., Fourcade, R., Do-Huu, N., Perron, Q., Gaston-Mathé, Y., Integrating data-driven computer-aided synthetic planning with AI-based generative drug design. A poster presented at Chemical Science Symposium 2020, 29 - 30 September 2020, UK, 1 p.
Updated: 5/20/2021 10:40 PM Additional information (references) has been added to all presentations on the updated agenda.