• Learning materials

    • Survey of neural program synthesis: Advanced Machine Learning Day 3: Neural Program Synthesis - YouTube
    • MIT Lectures: Introduction to Program Synthesis
    • IISC Lectures: Program Synthesis meets Machine Learning
    • CVPR workshops: Neuro-Symbolic Visual Reasoning and Program Synthesis
  • Ways to bridge ML and program synthesis

    • Pure symbolic search, e.g. FlashFill. The search space is exponential, so you must craft a DSL that balances expressivity against concision to keep search time practical
    • Symbolic search guided by a neural network that ranks, at each node in the AST, the most likely expansions; truncate the least likely ones. Explore several nodes breadth-first to avoid dead ends. (Is this what RobustFill is?)
    • Pure neural search, e.g. the SQL work: treat synthesis as machine translation, using an encoder-decoder with attention
      • To guarantee programs that at least compile, one can (a) tweak the standard beam search that generates the final output to also discard invalid prefixes, and (b) constrain the neural decoder itself to consider only the grammatically valid next tokens at each step [did not understand how this works]
    • Model programs not just as ASTs (already an improvement over a linear token sequence) but as graphs that capture semantic information, such as the dataflow dependencies a compiler computes (the graph also includes the linear token chain and the AST as subgraphs)
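The two tricks in (a) and (b) above can be sketched together: a beam search over partial programs that keeps only the top-k candidates ranked by a scorer (stubbed here; a real system would query the decoder network) and masks grammatically invalid next tokens at every step. The token set, toy grammar, and dummy scorer below are hypothetical stand-ins, not from any of the cited systems.

```python
# Hypothetical toy token set and grammar over arithmetic expressions
TOKENS = ["NUM", "+", "*", "<eos>"]

def valid_next(prefix):
    # Grammar mask: at the start or after an operator we must emit a
    # number; after a number we may emit an operator or stop.
    if not prefix or prefix[-1] in ("+", "*"):
        return {"NUM"}
    return {"+", "*", "<eos>"}

def beam_search(score_fn, beam_width=3, max_len=6):
    """score_fn(prefix) -> {token: log-prob}; stands in for a neural decoder."""
    beams = [([], 0.0)]                          # (prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, logprob in beams:
            scores = score_fn(prefix)
            for tok in valid_next(prefix):       # (b) expand only legal tokens
                if tok == "<eos>":
                    finished.append((prefix, logprob + scores[tok]))
                else:
                    candidates.append((prefix + [tok], logprob + scores[tok]))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]          # (a) truncate the least likely
        if not beams:
            break
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[0][0] if finished else beams[0][0]

# Dummy scorer returning fixed log-probs for every step
dummy = lambda prefix: {"NUM": -1.0, "+": -2.0, "*": -2.5, "<eos>": -0.5}
```

Every prefix the search keeps is valid by construction, so whatever it emits at least parses; that is the point of discarding invalid strings-so-far rather than filtering only at the end.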
  • Opportunities

    • Editing large code
    • Learning to use libraries empirically
    • Learning codebases with dynamic techniques
    • Generating domain specific representations multi-modally
  • TODO Unsorted

    • DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning - YouTube
      • Alternates between sleep and wake phases
      • Combines multiple techniques—symbolic search, neural search, and PGM
    • AI Coding with CodeRL: Toward Mastering Program Synthesis with Deep Reinforcement Learning
      • code
  • Spreadsheet applications

    • FlashFill
    • RobustFill: Deep Learning for Program Synthesis - Microsoft Research
    • SpreadsheetCoder: https://arxiv.org/abs/2106.15339
  • Generating code from visual representations

    • pix2code from uizard - paper with code
    • Sketching Interfaces – Airbnb Design - only a video and blog post
    • pix2struct: [2210.03347] Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
    • Understanding HTML with Large Language Models | Hacker News
  • Generating visual designs from natural language

    • GPT3 demo that later feeds into Magician
  • Competitive coding

    • DeepCoder: Learning to Write Programs - Microsoft Research
    • Competitive programming with AlphaCode
  • Understanding/modeling code

    • PolyCoder: an LLM trained exclusively on code that outperforms general-purpose LLMs (paper, code)

    • [1910.00577] Structural Language Models of Code: models code structurally rather than as flat sequences

    • code2vec: code → AST → vector


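A minimal sketch of the code → AST → vector idea. This is a simplification: real code2vec embeds leaf-to-leaf path contexts with learned attention, whereas the toy version below just hashes root-to-leaf node-type paths into a fixed-size bag-of-paths vector.

```python
import ast

def node_paths(tree):
    # Collect root-to-leaf sequences of AST node types (a crude stand-in
    # for code2vec's leaf-to-leaf path contexts)
    paths = []
    def walk(node, prefix):
        prefix = prefix + [type(node).__name__]
        children = list(ast.iter_child_nodes(node))
        if not children:
            paths.append(tuple(prefix))
        for child in children:
            walk(child, prefix)
    walk(tree, [])
    return paths

def code_to_vector(src, dim=64):
    # Hashed bag-of-paths: each AST path increments one bucket.
    # Note: hash() is seeded per process, so vectors are only
    # comparable within a single run.
    vec = [0.0] * dim
    for path in node_paths(ast.parse(src)):
        vec[hash(path) % dim] += 1.0
    return vec
```

Structurally similar snippets share many paths and so land in overlapping buckets; the learned version replaces the hashing with trainable path embeddings aggregated by attention.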
  • Bibliography

    • [2007.03629] Strong Generalization and Efficiency in Neural Programs
    • [2207.11765] Neurosymbolic Repair for Low-Code Formula Languages
    • [2108.07732] Program Synthesis with Large Language Models
    • State of Deep Learning for Code Generation (DL4Code) | Victor Dibia
    • DeepCoder: Learning to Write Programs - Microsoft Research
    • Competitive programming with AlphaCode
    • OpenAI Codex
    • [2002.08155] CodeBERT: A Pre-Trained Model for Programming and Natural Languages
    • What is this? [2109.00859] CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
    • Conversational AI Programming with CodeGen: Let AI Write Code For You
    • [2107.03374] Evaluating Large Language Models Trained on Code
    • ML-Enhanced Code Completion Improves Developer Productivity https://ai.googleblog.com/2022/07/ml-enhanced-code-completion-improves.html?m=1