CS700 Second Presentation

CS700 Second Presentation

28th January 1998

Andrew Broad

http://www.cs.man.ac.uk/~broada/

Supervisor: Nick Filer
CAD Group


  • Case-Based Reasoning applied to Automatic Programming
  • Automatic Programming applied to transforming constraints in information models

  • Automatic Programming

  • Generation of executable programs in ordinary programming languages from higher-level descriptions

    i.e. Computer, Program Thyself!

  • We seek descriptions that are:
    - very high level!
    - extremely declarative! (what rather than how)
    - abstract! (not too concerned with details such as algorithms and data structures)

  • Forward Engineering: High Level Description --> Program Code

  • Reverse Engineering: Program Code --> High Level Description


    Recap: Case-Based Reasoning

  • Solving new problems by analogy with old ones.

  • A Case-Based Reasoner remembers a previous similar problem and adapts its solution to solve the new problem.

    (diagram of CBR cycle)

    Advantages of Case-Based Reasoning (for Automatic Programming)

  • Learn from experience and improve capabilities!
  • Solve problems (write programs) quicker and easier!
  • Suggest solutions even in poorly-understood domains!
  • Warn of past mistakes and avoid them! (e.g. bugs)
  • Good for prototyping!

    Disadvantages

  • Large case libraries take up a lot of space!
  • Inappropriate retrievals could be counterproductive!


    Application: Transforming Constraints in Information Models

    What is Information Modelling?

  • An Information Model defines:
    - what entities (objects) exist in a domain
    - what attributes (fields/slots) those entities have
    - relationships between those entities/attributes
    - constraints on those entities/attributes

  • A schema is a collection of related entities and other gubbins (such as type definitions and global constraints)

  • Here's an example written in the information modelling language EXPRESS:
    SCHEMA University;
    	ENTITY Course;
    		code : STRING;
    		title : OPTIONAL STRING;
    		department : STRING;
    		number_of_students : INTEGER;
    	UNIQUE
    		code;
    	END_ENTITY;
    END_SCHEMA;
    


    Schema-to-Schema Mapping

  • Mapping data from one format to another

    (Data Repository)--->(Data Repository)

  • Schemata may differ structurally.

  • Schemata may differ semantically, i.e. in terms of their constraints. e.g.
    SCHEMA University_2;
    	ENTITY Department;
    		name : STRING;
    		courses : SET [1 : ?] OF Course;
    	END_ENTITY;
    
    	ENTITY Course;
    		code : STRING;
    		title : STRING;
    		students : INTEGER;
    	INVERSE
    		the_department : Department FOR courses;
    	UNIQUE
    		code;
    	WHERE
    		students >= 10 AND students <= 250;
    	END_ENTITY;
    END_SCHEMA;
    


  • Some records in the source repository cannot be transferred, because they would violate constraints in the target repository!

  • So, it would be useful to transform constraints on the target repository to apply to records in the source repository before transferring those records.

  • This is where automatic programming comes in - given models of the source and target schemata, generate a program to map from one to the other.

  • This has been done for structural transformations, but the problem of semantic difference (constraints) remains untackled!

  • These transformations will be carried out at a higher level of abstraction than on the constraints themselves, which involves a combination of reverse and forward engineering ("reengineering").

    (reengineering diagram)


    Some Research Questions

    Knowledge

  • What knowledge does an Automatic Programming system need?

                  `what'         `how'
             +--------------+--------------+
    specific | KNOWLEDGE OF | KNOWLEDGE OF |
             | THE PROBLEM  | THE PROGRAM  |
             +--------------+--------------+
     general | KNOWLEDGE OF | KNOWLEDGE OF |
             | THE DOMAIN   | PROGRAMMING  |
             +--------------+--------------+
    
  • What knowledge can be captured automatically from existing code? (reverse engineering)

    - Code topology
    - Call graphs
    - Control flow graphs
    - Data flow graphs
    - Variable reference graphs
    - Resource flow graphs
    - State-transition diagrams
    - Inheritance hierarchies
    - Module dependency graphs
    - Various metrics (e.g. cohesion and coupling)


    Architecture

  • What components should be included in an architecture for automatic programming?

  • What should be automated and what not?

  • Which parts should be case-based and which not?

  • From the CBR perspective:

    - What constitutes a case? (a whole program, or are parts of a program also cases in their own right?)

    - How should cases be represented and indexed?

    - How should the case library be organised such that retrieval is efficient as well as effective?

    - How should similarity be measured?

    - Is analogical reasoning needed?

    - Is adaptation needed, and if so, what kinds?

    - Is evaluation and repair of solutions required? If so, how should solutions be evaluated?

    - What is a sensible policy for memory update? (storing cases)

    - Is a multi-stage CBR process needed? If so, how should it be controlled?


    Back to CS700
    Hosted by www.Geocities.ws

    1