CS710 Summary

ANDREW BROAD'S CS710 SUMMARY


Week 1 (18th February 1998)

This was basically an organizational meeting for CS710, to establish who's doing it and what the course will cover. Like CS700, it is run by David Brée.

Some of the students are MPhils coming off CS700, while others are in their final year, writing up their PhDs. Some of them even did CS710 in their Masters year, and are coming back for a second dose - "It'll force me to write my thesis!" (anonymous).

The deliverables for CS710 will be the abstract of each student's thesis, the table of contents, and at least one chapter (usually the first or the last), plus they will have to give a presentation of their thesis plan.

It seemed to me that the PhDs have already started writing their thesis, whereas the MPhils (at least the September starters) haven't started yet (I know I certainly haven't - I want to get clear on the big picture of my thesis before I start actually writing it up, and I don't think I'm 100% clear yet).

As usual for a Level 7 module, the emphasis is on learning by doing - to have a go at writing our theses, and to read others' theses and try to see what's wrong with them! We should consider where we are now with respect to writing our thesis, think about where we want to be by Easter, and try to reduce the difference, rather like the General Problem Solver (GPS), an old AI system.

Rather than following a set syllabus, David Brée will try to teach what we the students want to learn. Today that question was put to the group, and discussed for the rest of the seminar.

It seems that what we want to learn about thesis writing basically falls into two categories: motivation and the architecture of the thesis.

Motivation

Motivational issues include how to get started on writing a thesis (beginning is the hardest part), and keeping going once started - thesis writers tend to run out of steam at times, and find themselves in periods where they will do anything to avoid working on it, like making endless cups of tea, watching the telly or surfing the Internet - we all know the feeling, I suspect.

Equally, some thesis writers are apt to write the first chapter or two and then spend months perfecting it when they should be forging ahead with the rest of the thesis! (I felt that a little bit when writing my CS3900 project report last year.)

Thesis Architecture

Architectural issues in thesis writing include working out the big picture, how to put the details into the big picture, where to put the sordid details (the first and last chapters should concentrate on the big picture, so the answer is somewhere in the middle), how to bring the strands of the thesis together, and how to write up experiments.

Multi-disciplinary theses are more challenging to write than theses which concern one field, in certain ways. For example, my primary research interest is Case-Based Reasoning. I'm looking to apply that to Automatic Programming - a fairly novel cross-fertilization. Case-Based Automatic Programming will be applied in turn to a simplified, concrete exemplar: transforming constraints in information models for schema-to-schema mapping. So I guess I'll need a chapter to introduce Automatic Programming (which is the big picture of the thesis, though not of my research interests), one to introduce CBR and one to introduce the constraints problem before I get to the heart of the thesis.

Writing The Thesis

There are no God-given rules for thesis writing - at best you can follow a set of heuristics and try to avoid mistakes other people have made in the past.

Of course the University does have some regulations for theses - these are on the Web at http://www.cs.man.ac.uk/rgd/David/CS710/theses.html, but these are the rubric rather than the rhetoric.

Where the research falls on the science v engineering continuum (as discussed in CS700) also comes into play in thesis writing. Are you doing experiments? Pure theory? Engineering? If you are doing an `engineering' type project (i.e. building a system to do something) then you need to try to get as much science out of it as possible, by generalizing it. Research Associates in particular suffer from this problem.

People who have written a big, dirty system need to write it up coherently, as they have a tendency to keep repeating themselves.

An important issue is how much background knowledge to put in the thesis and how much to assume the reader already has. This depends on who you are writing for - in particular, who is going to be your internal examiner, so you should find out who that is going to be and find out about them (e.g. are they an expert in the field? are they a bit surreal?).

So, one issue in thesis writing is what to leave out! Some things (e.g. certain background for the benefit of the reader, as they won't want to bother going to read it from another source) could be put in appendices (a case that springs to mind is Ian Pratt's fabulous Artificial Intelligence book, which had a chapter on predicate calculus consigned to an appendix as he didn't want to disrupt the flow of his book). As for myself, I tend to put certain sordid details in appendices - things that aren't necessary to understand the thesis as a whole but I want to include for completeness (see my CS3900 project report for example).

Incidentally, the choice of internal/external examiners is an important one, as the wrong choice could be a big mistake if they are going to bounce the thesis! You don't have the right to make that choice yourself, but you can certainly influence the decision!

Another thing PhD candidates worry about is how to reassure themselves that they've got all the relevant references: that there isn't some paper out there in which someone has done basically the same work without their knowing about it, or something important that should have been taken into account in the research. The short answer is: you can't. There's no way you can be sure, and you just have to hope that the examiners wouldn't find it either!

Finally, bear in mind that, although writing a thesis is an open-ended task, it's not your life's work! You have to close it off somehow, sometime, so one of the issues addressed in CS710 will be how to stop a thesis.

Your Supervisor

Obviously, you will want your supervisor to read drafts of your thesis before you decide to submit it, and you will expect to get feedback, including constructive criticism.

Hopefully, you will have a wise supervisor, but that doesn't mean they're always right, particularly on trivialities such as "don't use abbreviations" or "don't use I, you or we". Another problem is what to do if your supervisor says that each draft you show them is fine, without giving any real (negative) feedback, and you're worried that it's not really fine. A trick David Brée suggested is to throw in a completely garbled paragraph and see what happens!

The Textbook

The textbook for CS710 is Linda Flower's Problem Solving Strategies for Writing (Fourth Edition), Harcourt Brace 1992. This is a good book to go to with any problems you have with respect to writing your thesis, as it treats thesis writing as a technical problem, which can be solved just like any other problem. It includes case studies of people writing their theses. It can be found in the library, according to David Brée, but Blackwells don't stock it - I've had to order it.

What's coming up in CS710?

CS710 has a website at http://www.cs.man.ac.uk/rgd/David/cs710.html, which was recently updated for the first time since 1996! The agenda is at http://www.cs.man.ac.uk/rgd/David/CS710/thisyear.html, and I trust it will be updated throughout the semester.

Over the next few weeks, CS710 will try to address the issues that were raised in this meeting, plus two standard events:

David Brée will get someone who has just completed their thesis to come in and talk about how they did it! While this can be useful, the caveat is that all theses are different, and depend very much on the individual.

The abstract and a chapter from a couple of actual MPhil and PhD theses will be given to us as good exemplars.

I would also be interested to see some theses that failed, together with an analysis of what was wrong with them. I think that would help us to avoid mistakes.

Week 2 (25th February 1998)

There was a handout today, a yellow booklet called "How to Write a Thesis: Advice on the Preparation of Continuation Reports, MSc and PhD Theses". Three caveats are in order, however:-

The booklet was cowritten by the School of Biological Sciences, so much of it is specific to Biology rather than Computer Science (in particular the emphasis on experiments, which reminds me of high school science lessons!). However, I think it is worth reading to take what you can from it!

The booklet was written in 1994, so it might be out of date (in particular, MPhils were still called MScs in those days). This brings into question the up-to-dateness of the rest of the information.

The booklet gives guidelines and advice, but is not a substitute for the rules and regulations!

Are the REGULATIONS FOR THE PRESENTATION OF THESES AND DISSERTATIONS at http://www.cs.man.ac.uk/rgd/David/CS710/theses.html all we need to know, or ought we to get a copy of the "University Regulations" from the Examinations Office?

Motivation and Planning

Writing a PhD thesis is a daunting task because:

It has to be large.

It has to be right.

In today's seminar, we discussed what is expected of a PhD thesis, in particular what candidates expect of their own thesis.

A PhD candidate expects, at the minimum, for their thesis to get them a PhD. In addition, it's nice if it's not boring!

A PhD candidate expects their thesis to be "good". This goodness boils down to two sorts of correctness:-

Intrinsic correctness: Every statement in the thesis is scientifically `watertight'. This is important for a PhD (not so much so for an MPhil) because, in the viva (oral exam where you defend the thesis you have submitted), the examiners could ask you to justify any statement you made in the thesis!

(There's no viva for an MPhil thesis, is there?)

Extrinsic correctness: What this means basically is you have covered all the relevant literature (and referred to it in the thesis). This is something you can never be 100% sure of, so you should aim for 90-99%.

Intrinsic correctness is much more important than extrinsic correctness in terms of passing the PhD examination - they're not going to torpedo you if you miss some obscure reference (and the chances are, if you haven't come across it in three years of burying yourself in this stuff, then neither will they!)!

It's very important that a thesis is a ROUNDED STORY.

A thesis is, by ancient definition, a coherent argument, though the word `thesis' is often used to refer to the physical report in which the thesis itself is laid out.

A thesis is an argument that can be summarized in a nutshell. So you start by writing the Abstract, in which you make a claim (or state a problem to be solved) and then, in the body of the thesis, you prove the claim (or show how you solved the problem). As mentioned in CS700, a PhD doesn't have to be a revolution, in fact it shouldn't be unless you are outstandingly brilliant! (Most people get their PhDs by making a `normal science' contribution, and succeed through determination and perseverance rather than sheer brilliance.)

Reminder of the Criteria a PhD must satisfy

As discussed in CS700, a PhD has to be:-

Original.

Scientific. You show this by analysing/evaluating it.
- You have to show that the problem is of scientific interest within a certain setting (a wider context). So it's important to decide for whom you are writing the thesis - which scientific audience are you addressing?
- It's also in how you interpret (look at) the data.

Choice of Examiners

Technically, it's not the candidate's decision who is their external examiner, but you can make suggestions to your supervisor, or veto their suggestions, e.g. "I don't want him, he's erratic!"
- "A rat?"
- "No, ERRATIC!"

You need an examiner who is sympathetic to your approach to science.

It's important that the external examiner is an expert in the area for which you wish to be examined, especially if your work is interdisciplinary. For example, if what you're doing is a cross-fertilisation of Computer Science and Physics, then don't get an external who's a Physics specialist if your work is more of a contribution to Computer Science! I guess I have a kind of similar thing myself, as my research concerns the application of case-based reasoning to automatic programming. Many PhDs who fail come to grief on this particular reef.

On the other hand, if the examiner is not an expert in your field then they are less likely to be able to catch you out - for gaps in your literature knowledge, for example. It's nice to feel that you're the expert, when there is a dearth of expertise in the country - this is particularly the case in Case-Based Reasoning!

Time and Length Planning

A PhD thesis (the report) can be as short as 80 pages or as long as 350. David Brée likes a PhD to be under 120 pages, whereas Hilary Kahn won't accept less than 200 pages (she's in the area of large systems, and standards, which need a lot of description!).

Bear in mind that the longer a thesis, the more difficult it is to hold it together - both the argument logically and the report physically! :-D

The rule of thumb is that it takes one working day to write (and later redraft) each page, so don't postpone starting for too long. I must confess I haven't started writing my MPhil thesis yet (I hope to have it finished by the end of September), but I should be writing a page a day by May!

Don't expect to get it right on the first draft - expect a lot of rewriting!

The important message is to allow plenty of time for writing the thesis: allocate at least 10% more time than you think you'll need! (In my experience, most tasks in life take longer than you think beforehand - I'm always having to postpone things! Getting down to them is the hardest part.)
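The page-a-day rule of thumb and the 10% allowance can be turned into a quick back-of-the-envelope calculation. The sketch below is only an illustration: the function name and the `contingency_percent` parameter are made up for this example, and it assumes exactly one working day per page, as the rule of thumb says.

```python
def estimate_writing_days(pages, contingency_percent=10):
    """Working days to budget for writing a thesis of the given length.

    Assumes the rule of thumb of one working day per page (writing plus
    later redrafting), then adds a contingency on top (at least 10%).
    """
    # Integer arithmetic avoids floating-point rounding surprises.
    scaled = pages * (100 + contingency_percent)
    # Round up: a partly used working day still has to be set aside.
    return (scaled + 99) // 100


# 120 and 200 pages are the two PhD lengths mentioned above.
for pages in (120, 200):
    print(pages, "pages ->", estimate_writing_days(pages), "working days")
```

At roughly five working days a week, even the shorter of those two lengths comes to about half a year of writing, which is the real point of the rule of thumb.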

Getting Started

Getting started is difficult because there's no algorithm for beginning. You have to develop an `algorithm' for writing your thesis as you go along.

Like most tasks when you don't know what to do, break it into smaller parts: into chapters, into sections, into subsections. Then write the little chunks of text when you get excited! (you don't necessarily have to write them in order!). Take a top down view of each chapter, and get the general plan clear in your mind before you write the text.
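One way to picture this top-down breakdown is as a small tree of chapters, sections and subsections that gets flattened into a numbered outline. The sketch below is an illustration only: the function `numbered_contents` and every title in `outline` are hypothetical, not taken from any real thesis.

```python
def numbered_contents(outline, prefix=""):
    """Flatten a nested outline into numbered table-of-contents lines.

    `outline` is a list of (title, children) pairs, where `children` is
    another list of the same shape (empty for a leaf).
    """
    lines = []
    for index, (title, children) in enumerate(outline, start=1):
        number = f"{prefix}{index}"
        lines.append(f"{number} {title}")
        # Recurse into the sub-parts, extending the numbering (1 -> 1.1).
        lines.extend(numbered_contents(children, prefix=number + "."))
    return lines


# Hypothetical outline, purely for illustration.
outline = [
    ("Introduction", [("Motivation", []), ("Structure of the thesis", [])]),
    ("Background", [("Case-based reasoning", []), ("Automatic programming", [])]),
    ("Conclusions", []),
]
print("\n".join(numbered_contents(outline)))
```

Because the chunks are just entries in the tree, they can be written in any order and the numbering regenerated afterwards.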

I wholeheartedly agree with this approach of splitting the thesis at the top level and working out the hierarchy before writing the text (rather than just writing the whole thing from beginning to end) because I think it's more enjoyable this way as well as helping to keep it coherent. I used this approach for my second CS3411 essay and it was pure fun to write!

In her book Problem Solving Strategies for Writing, Linda Flower suggests the use of an issue tree to formulate an argument. This was illustrated in the lecture with the example of constructing magic squares (which I won't regurgitate here, 'cause I don't like it). Again, it's the idea of breaking a big problem into smaller parts when you get stuck.

A sort of multiple inheritance occurs when a problem is a cross-fertilisation of two disciplines (e.g. Physics and graphics). But again, you just break the problem into smaller parts.

Group Planning

When your enthusiasm wanes, go show your thesis plan to someone (perhaps your supervisor, maybe someone else or several people) and get them to ask questions to rekindle your enthusiasm.

There are two kinds of thesis plan you should make:

Structure of the thesis (chapters, sections and subsections);

Plan of action (what you're going to do, when).

Week 3 (4th March 1998)

Today's seminar was primarily about the table of contents for a thesis, and an example Contents and Abstract from an MSc II (the pre-1997 equivalent of an MPhil) was given as a handout.

First, however, David Brée went over the points the examiners are looking for in Masters theses, which I have written down as best I can.

Does anyone have a verbatim copy of these points, or are they on the Web somewhere? If you are David Brée reading this, please give them as a handout!

Examiners' Questions for an MSc I

This is referring to taught Masters courses (MSc Method I), where the students do six months of taught modules, followed by six months of research resulting in a dissertation. So it's really more relevant to CS699 than CS710!

1. Does the thesis show satisfactory experience of research methods as can be gained in one year?

2. Has the work been carried out in a satisfactory manner?

3. Is there a discussion of the purpose of the investigation, with reference to previous work?

4. Is the thesis satisfactorily presented, with diagrams and references? (see Nicholas Higham's book Handbook of Writing for the Mathematical Sciences)

Examiners' Questions for an MPhil

This is referring to a Masters by research, formerly known as MSc Method II, where students do twelve months of research resulting in a thesis, which is what half of us in CS710 are doing.

1. Has the candidate been successful in achieving his (or her) aims and objectives? This means that it is important to be clear at the beginning and the end of the thesis, and link the conclusion back to the beginning.

2. Has the candidate shown originality and independent critical judgement? You have to relate the literature review to your work, and show how each reference is relevant to you - make it clear why you need to review their work. The purpose of the literature review is not to show that you're familiar with the literature of your general field, it is to analyse the work of others that is relevant to yours, so only include those references that are necessary. Don't include irrelevant stuff. Don't include crap stuff (articles are the right place to flame somebody's work, not theses!). Do include the external examiner's stuff. Therefore, don't have a crap external examiner!

3. Does the research reported in the thesis constitute an addition to knowledge? This is rather difficult to judge, as "addition to knowledge" is a very open-textured concept. It should be a useful and scientific contribution, but it does not have to be all that significant a contribution for an MPhil, and not even for a PhD! Some people think that the contribution of a PhD has to turn the subject upside-down and leave it rocking on its foundations, but that's a vast exaggeration! For a PhD, it's more important to show that you're a fully professional researcher with an expert grasp of your field, and knowing the boundaries so that you can extend them as needed. So don't get too paranoid about not making a novel contribution. Since Computer Science is actually very much an engineering subject, we have much less of a problem in this regard than, say, pure mathematics, where your novel contribution does go out the window if someone comes up with the same equation as you and publishes it first!

Bear in mind that all new knowledge is essentially just a combination of old knowledge!

An MSc II Thesis

We were given a handout today of the contents, abstract and a snippet from the first chapter of the MSc thesis (today it would have been an MPhil) Scalability of Machine Learning Algorithms by Georgios Paliouras, one of David Brée's charges (he went on to get a PhD in 1997). This thesis compares five machine learning algorithms to decide whether they scale up.

Paliouras's MSc was an evaluation thesis rather than a new contribution (it analyses the scalability of existing algorithms rather than inventing any new ones). This is quite common for Masters theses, because an evaluation thesis is easier to write (but it's so boring and devoid of thrills!). The contribution of Paliouras's MSc was that nobody had ever done such a thorough review of the field of Machine Learning before.

The Table of Contents

This is the easiest part of the thesis to do and redo, and it's also the natural way to plan the structure of your thesis. It puts each section of the thesis into context, and you can see what needs to be done if you start by writing the contents. Try it today!

The table of contents should be understandable in and of itself. I was able to get the gist of Paliouras's thesis just by skimming through the contents, because I'm very intelligent and I know a lot about AI (in fact, the field of Machine Learning is not unrelated to Case-Based Reasoning!). But it's also to Paliouras's credit that he wrote a good contents!

The only criticism of this table of contents is nit-picking things like the style of his headings: it uses a lot of abbreviations, which impair understanding if you don't know what they mean. For example, even David Brée didn't know what PLS1 stands for without looking it up, so avoid trips down jargon lane!

Some people think you should avoid capital letters in headings wherever possible, but I think it's just important to have a case convention and stick to it consistently. For example, Section 1.4 is titled "Motivation for the project", whereas Section 1.5 is titled "The Structure of the Thesis", which is rather ugly because it's inconsistent.

Another point about Paliouras's thesis, judging just by the table of contents, is that Section 2.4 is only weakly connected to Chapter 3, even though they're both obviously about machine learning algorithms.

Do not underestimate the importance of the table of contents! The examiners usually read the contents second (after reading the abstract), and keep referring to it throughout their reading of the thesis, so the contents should be a good road map of the thesis! In certain ways, you're a bad judge of the comprehensibility of your own table of contents (and of your writing in general), because you know it all, and may have omitted something that's obvious to you but possibly unbeknown to the reader. So it's a good idea to get someone else to read it!

Main Body of Thesis versus Appendices

Think of a thesis as a detective story, where the author states a problem and then sets out to solve it! The thesis should have a fascinating story to tell, and you should avoid deviating from the storyline too much in the main body of the thesis because it weakens the impact of the message you're trying to convey.

So you should consider putting stuff which is boring but essential background in appendices rather than in the thesis itself. The best example I've seen of this is in Ian Pratt's Artificial Intelligence book, where he relegates a tutorial on the predicate calculus to the Appendix so that that mundane stuff, with which some but not all readers will be familiar, does not disrupt the flow of his exciting book as he goes on to discuss logic and inference in Chapter 3!

Things like experimental results (data and graphs) and protocol analysis (a transcript of someone's thoughts as they solve a problem, which can provide useful insights for AI) should be put in appendices, with a sample included in the main text where you're trying to make a point about the results.

Graphs

Because Paliouras's thesis involved experiments to determine the scalability of algorithms, he plotted his results as graphs (one continuous variable against another), and David Brée disrupted the fascinating storyline of today's seminar to discuss points about the graphs. They were plotted on a log-log scale so that O(n log n) growth looks close to linear. Each point was plotted with its standard deviation as well as the value itself. The difference in speed lay not so much in the computational complexity (all the algorithms are O(n log n)) as in where the curves start along the axis. And funny blips in a graph can often be explained (e.g. the bounds of a matrix were exceeded, and its size had to be doubled or whatever). The place for graphs is in appendices.
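The remark about log-log axes can be checked numerically. On log-log axes a pure power law n^k is a straight line of slope k, and a constant factor only shifts the line up or down. n log n is not exactly straight, but its slope is 1 + 1/ln n, which creeps towards 1 as n grows. The sketch below is an illustration with made-up function names and arbitrary input sizes; it has nothing to do with Paliouras's actual data.

```python
import math


def loglog_slope(f, n, factor=2.0):
    """Slope of f on log-log axes, measured between n and factor * n."""
    rise = math.log(f(factor * n)) - math.log(f(n))
    run = math.log(factor * n) - math.log(n)
    return rise / run


def n_log_n(n):
    """Cost model for an O(n log n) algorithm (constant factor ignored)."""
    return n * math.log(n)


# Arbitrary input sizes chosen for illustration.
for n in (1e3, 1e6, 1e9):
    print(f"n = {n:.0e}: log-log slope of n log n is {loglog_slope(n_log_n, n):.3f}")
```

A slope only slightly above 1 is why such curves look nearly straight, and why algorithms of the same complexity show up as roughly parallel lines that differ mainly in where they start.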

Week 4 (11th March 1998)

A Theoretical PhD Thesis

Today, we looked at a PhD thesis on formal methods: "Proving That Computer Programs Terminate Cleanly", which was written by Richard Sites in the 1970s (no earlier than 1974, judging from the citations). It was recommended by Cliff Jones as the best PhD thesis of its kind.

The strange things that struck me about it were that it was typewritten rather than word-processed as it would surely have been in this day and age, and the thesis was so short - 65 pages for the main text is shorter than my CS3900 report!

The other thing that jumps out at you when you peruse the table of contents is that the literature review is in Chapter 5, which is unusually late, since most theses have the literature review in Chapter 2. This implies that his work can be understood independently of others'.

The purpose of the literature review is to show that you're an authority in your field. It should not be a list of references described for their own sake; it should really tie them into your thesis. It's nice if it shows an insight, such as a novel classification of the literature. You should step back from your subject and look at what it really means rather than how we normally think of it.

The Abstract

The abstract for a thesis must be confined to one page, so it's quite a short piece of text to put the gist of the thesis across. Sites's abstract is particularly short, as it only fills about two-thirds of the available space!

In essence, the abstract has to say two things:

What you have done. And what results you have achieved.

Why you did it. This could be in terms of application (e.g. Sites's work has wide application to high-level languages), or a contrast to other work (e.g. Sites's work complements work on program correctness, but deals with termination rather than correctness properties).

You should begin the abstract with a major claim, saying what you have done in the first sentence: "A system of techniques is presented for proving mechanically that a computer program terminates cleanly." Then you should say exactly what you have done: "Clean termination means that the program has no infinite loops and no semantic errors."

The abstract should give the motivation for what you have done. It should say what artifact you have implemented and how you did it. Note that Sites's abstract says nothing about results - it omits to mention that there is no computer implementation of the techniques, and it doesn't mention what results were achieved using the techniques (no doubt he had his reasons, but this comes across as a somewhat incomplete abstract).

It's very important in your thesis to be clear about what you have done and how you know you've achieved it. The two extremes for evaluation are deduction and hypothesis testing. There are various ways of testing IT systems, such as doing experiments, getting users to fill in questionnaires, and ethnographic methods (I've no idea what those are!). In areas such as information systems, sometimes there's no choice as to how to evaluate systems!

Re: Note to the reader

This is a most unusual section that Sites has included, a sort of meta-level section, the like of which I have never seen in a thesis before! It tells the reader that the thesis can be read at various levels of detail, and is obviously written for the casual reader rather than the examiner - the sentence "Hopefully, after reading the introduction, you will have enough information to decide whether to read the rest" sounds to me like an invitation for the examiner to bounce the thesis! (Obviously, the examiners have to read the whole thing.)

Although I derecommend having such a section in your own thesis, it is nevertheless instructive about some features that a thesis ought to have:

Each chapter should have a summary, and that summary should be at the beginning of the chapter, so that the reader can gain an overall understanding of the chapter before they proceed to read it.

You should consider what are the main ideas you want an outside reader to take away from the thesis - what message it gives to people working in that domain.

You could try doing a flow diagram for the thesis, to denote the dependencies between the chapters.
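Such a flow diagram is just a dependency graph, and any valid reading order can be derived from it mechanically. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The chapter names and their dependencies are hypothetical, invented for illustration rather than taken from Sites's thesis.

```python
from graphlib import TopologicalSorter

# Hypothetical chapters: each maps to the chapters it depends on.
depends_on = {
    "Introduction": set(),
    "Background": {"Introduction"},
    "Method": {"Background"},
    "Evaluation": {"Method"},
    "Related work": {"Introduction"},
    "Conclusions": {"Evaluation", "Related work"},
}

# static_order() yields every chapter after all the chapters it depends on,
# and raises graphlib.CycleError if the dependencies are circular.
reading_order = list(TopologicalSorter(depends_on).static_order())
print(reading_order)
```

A chapter with few incoming dependencies (here "Related work") can be moved around freely, which is one way to see why a late literature review can still work.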

Re: Chapter 0. Introduction

This looks very much like a second abstract! But it's too long for an abstract, and it contains extra stuff.

It gets a bit negative on page 2: the techniques do not apply to the broad range of programs, as the abstract led us to hope!

The introduction expands the last paragraph of the abstract, which contrasts proof of clean termination with proof of correctness. It establishes the scope - and non-scope - of the thesis.

On page 3, Sites uses an example (TREESORT3) to show that proof of clean termination is useful.

The limitations of the work described in the thesis are stated on page 4. It is important to be clear about what your artifact doesn't do as well as what it does! It's better to be up front about your shortcomings than to be `discovered' by the examiners, and it's also a sign of maturity as a researcher. So many journal articles aren't honest about limitations, which makes it more difficult for us to critically assess their work!

It's disappointing that Sites didn't manage a computer implementation of his techniques, despite the jazzy title of the thesis!

Don't overclaim, either - e.g. "This is better than Fermat's Last Theorem!" It's better to be modest than to proclaim loudly how wonderful you are and then fail to live up to your own hype (unless like me you really are brilliant, and then you're entitled to boast! :-)). Stating the limitations is also another way of scoping things out of your thesis project.

Week 5 (18th March 1998)

An Engineering PhD Thesis

Today, we looked at an `engineering' PhD thesis: "The Application of Visualisation Techniques to Three-Dimensional Semiconductor Device Simulation", which, according to a search of the JRULM catalogue, was written by Jonathan Cox in 1994. It was supervised by Hilary Kahn.

Even though, again, it's supposed to be the best thesis of its kind, we still managed to rip it to shreds! Only once a thesis has been written can it really be understood and criticised, including by the author! It's common for people who get a PhD to look back at their thesis a year later and think "that was a mistake!" (presumably of a particular aspect of it rather than the whole thing! ;-)).

The Title

It's important for the title of a thesis to grab attention, because more people will read the title than anything else in the thesis!

Cox's thesis title, "The Application of Visualisation Techniques to Three-Dimensional Semiconductor Device Simulation" seems to capture the two key concepts that I thought the thesis was about when we spent five minutes at the start of the seminar reading through the table of contents, the abstract and the prologue: device simulation and visualisation.

Note that a device in this context means anything that has semiconductors in it.

I'm finding it hard to come up with a good title for my MPhil thesis, which is about applying Case-Based Reasoning to Automatic Programming - in particular, to the transformation of constraints in information models for schema-to-schema mapping - I'm not sure how I can express that succinctly in a title!

The most arresting thesis title I've ever read is that of Paul Pun's PhD thesis: "KNOWLEDGE-BASED APPLICATIONS = KNOWLEDGE-BASE + MAPPINGS + APPLICATIONS".

The Table of Contents

In contrast to Sites's thesis of the previous week, this thesis is very long, with the main text taking 259 pages. What is particularly surprising is that there seems to be a lot of background indeed, with the first two chapters (75 pages) introducing device simulation and visualisation.

Remember that a table of contents should be understandable in and of itself - you should read it as if you knew nothing and see if it makes sense. Cox's Contents are a bit unclear: for example, what is AVS (Section 2.5) - is it a particular visualisation tool? (Abbreviations should largely be avoided in the contents.) It's not clear what system Chapter 3 ("System Overview") is an overview of, or whether it's his own work or someone else's. And some people find it strange that there's a "Summary and Overview" of the thesis tucked away in Section 2.6 (although it's also in the Prologue). I don't find this so strange myself, because presumably it serves to re-orientate the reader after they have dredged through all that background!

The contents should give the reader an impression of what you have done. Cox's Contents are not very good at that - they are obscured by all that background!

What's Needed?

A thesis needs to have a conclusion.

A thesis needs to have some evaluation, e.g. of whether the thousands of lines of code you wrote fall down or not. Bear in mind that the software you write is the means, not the end! It's important to leave time for evaluation when planning the thesis.

Nitty-gritty details (e.g. of the implementation) should be in appendices rather than the main text.

Cox's thesis is scientific engineering (building something to test a theory) rather than just engineering. This means that you have to test that it works, and why it works! It's dangerous to set out to engineer something that's new, because Computer Science is so rapidly advancing that three or four years down the road, it won't be new any more (e.g. the World Wide Web). If you do this, you have to save yourself by making it scientific!

Here are a couple of nasty questions that could get asked in the viva (oral exam):

Which aspect do you want to be examined on? For example, if your thesis work has a database aspect, a networking aspect and a user interface aspect, you might have to choose the point of view from which it is to be examined. This is a crucial decision, and can mean the difference between passing and failing in some cases!

How do your results generalise? (In what way could they be generalised?)

The Abstract

Cox's abstract is, quite frankly, a ghastly piece of English. It's full of horrible sentences, and we went through it and picked them all out like English teachers!

For one thing, the abstract doesn't claim enough - it doesn't say anything about building a system, and nowhere does it look like he ran any experiments! It's not even clear from the abstract that there was a software implementation! There also seems to be no conclusion in the abstract.

Let's go through the abstract blow-by-blow:

> Semiconductor device simulation, by allowing the detailed derivation and analysis of
> device behaviour, provides a cost effective tool for the development and testing of
> prototype device structures.

You should begin a thesis with a good sentence, but this is a horrible one, because it uses a subclause to qualify the first noun, which badly delays the verb phrase!

Semiconductor device simulation does not provide a tool, it is a tool!

It is bad to say that the tool is "cost effective", because it's a claim that you can't back up in the thesis, plus this is not an application for an EPSRC grant!

> This thesis investigates the scope, merits and practicality of employing
> visualisation methodology to improve the usability and efficiency of the simulator.

It's not strictly correct to animate the thesis by making it the subject of the sentence - the thesis doesn't do the investigating, it just sits on a shelf! Also, "methodology" is the wrong word here - better "visualisation methods" or "visualisation techniques".

> There are two approaches, of differing scope, to the application of visualisation techniques.
> The narrowest application is to use data visualisation techniques to post-process simulation
> data in order to allow its visual interpretation and analysis.

It should be "the narrower application", because there are just two of them!

> This helps to satisfy requirements both for visual verification of the problem description
> and the analysis of the corresponding simulation results.

"For" and "and" are a bad combination here, as they make it ambiguous as to whether it's "for visual verification of the problem description and for the analysis of the corresponding simulation results" or "for visual verification of the problem description and visual verification of the analysis of the corresponding simulation results".

> The broadest application integrates visualisation techniques within the computational
> simulation environment.

The broader application does not integrate visualisation techniques - the author is the one doing the integrating - it uses visualisation techniques.

> The visual representation of data provides an intuitive visual interface to the simulation
> process which can hide underlying simulation management functionality.

Who's doing the hiding? (Okay, this is just nit-picking!) The moral is, be careful with "which" clauses.

> This enhances usability.

It is bad to use an anaphor, especially after such a long, nasty sentence before it, because it's unclear what it refers back to.

> This thesis first introduces device simulation and visualisation methodology. It then
> discusses the scope for applying visualisation techniques to this problem domain. Each of
> the two approaches identified is then discussed with reference to a formal model, and
> evaluated via a concrete practical implementation.

This is the last paragraph - it dishes out the dirt too late!

The examiner was obviously very generous not to fail the thesis on account of the abstract - if it were me, I'd at least refer it back for 'minor corrections'!

Three golden rules for abstracts:

1. Start with a strong sentence.

2. Say what you did.

3. Include a conclusion. Say why you are doing this, what you have achieved, and claim your novel contribution (e.g. "have built the first ... in the world"). Basically, it's a statement of the form "I built an X to do Y".

Next week Rizos Sakellariou, who gave that wonderful CS700 talk last semester, will give another one in CS710 on his thesis!

Week 6 (25th March 1998)

How (Rizos Sakellariou)

See my notes on Rizos Sakellariou's very brilliant CS700 talk.

Rizos Sakellariou's thesis was very mathematical in nature, so the following two books were appropriate for writing it:

[1] Higham N.J. (1993). Handbook of Writing for the Mathematical Sciences. Society for Industrial & Applied Mathematics. ISBN 0898713145.

[2] Knuth D.E. (1989). Mathematical Writing. Mathematical Association of America. ISBN 088385063X.

General Tips

1. Write, rewrite, and keep rewriting!

2. Model the reader! Keep asking yourself: "How are they going to misunderstand this?" Include guidelines for the reader.

3. Master the medium and the material!

4. Simplify!

Lie if it helps!

Use simple examples.

Mathematicians don't like to state a theorem in 2 (or 3) dimensions if it can be generalised to n dimensions, but this is painful for the reader!

5. Aim for excellence!

Do try to give an example to represent every theorem. I'm reminded of the great Ian Pratt, who always starts with an example before explaining a difficult concept. Spend time looking for hidden analogies that will help the reader to understand (this is one of Rizos Sakellariou's strengths). Start with simple cases and examples before presenting the main results; present special cases of a theorem before presenting the general case of that theorem. For example, if a theorem in geometry applies to higher dimensions, start by restricting it to 2D geometry so that the reader can grasp it before having to try to visualise it in hyperspace!
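To illustrate the "special case before general case" tip, here is a small sketch in LaTeX. The choice of theorem (the Cauchy-Schwarz inequality) is my own assumption for illustration and was not part of the talk; the point is only the ordering, with the two-dimensional statement shown before the n-dimensional one.

```latex
% Special case first: the plane (n = 2), which the reader can picture.
\[
  (x_1 y_1 + x_2 y_2)^2 \le (x_1^2 + x_2^2)(y_1^2 + y_2^2)
\]

% General case afterwards: the same statement in n dimensions.
\[
  \Bigl(\sum_{i=1}^{n} x_i y_i\Bigr)^2
  \le
  \Bigl(\sum_{i=1}^{n} x_i^2\Bigr)\Bigl(\sum_{i=1}^{n} y_i^2\Bigr)
\]
```

Setting n = 2 in the second display gives back the first, so the reader can check the general statement against the case they already understand.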

Tips on Writing Style

1. Make it clear what you did and what others did, because the examiners need to see what your contribution is! Take care over the use of we versus I (my supervisor forbids the use of first and second person, which I personally believe cramps my style, but there's a tradeoff between academic writing and accessible writing!), use of the passive voice and use of quotations.

2. The opening section should be the best section! You need to make a good impression on the general reader (and, in particular, the examiners), particularly at the beginning! You should avoid deep technicalities in the first chapter, and predict what the gaps in the readers' knowledge will be (in general, you should assume they are well-versed in Computer Science but are not specialists in the particular area of your thesis).

You need to come up with a title for the thesis that conveys the essence of the thesis. The title of Rizos Sakellariou's PhD thesis was "ON THE QUEST FOR PERFECT LOAD BALANCE IN LOOP-BASED PARALLEL COMPUTATIONS." The key words here are perfect load balance, loop-based and parallel computations.

3. You shouldn't use the same notation for different things - it's confusing! A good tip is to include a summary of the notation used in the thesis, at the beginning of the thesis.

4. Don't start a sentence with a symbol! A sentence beginning "w is the workload..." looks awkward and is unacceptable. You should integrate equations into sentences properly (e.g. if an equation is at the end of the sentence, you should still finish the sentence with a full stop). An alternative is to use reference numbers, e.g.

x + y = z (EQ10)

and then refer to this equation as EQ10 in the text - this is especially useful for referring to equations that are several pages away.

5. Adopt a methodical numbering convention and stick to it consistently. References to chapters, sections, figures, tables, theorems and lemmata should be capitalised if accompanied by a number, e.g. Chapter 2, Section 2.4.1, Figure 2.5, Table 4.6 etc. Sections, figures, tables and equations should include the chapter number, and you may even choose to include section and subsection numbers: Equation 4.3.2.5 looks rather long-winded, but it tells the reader exactly where it is!

6. Remember to never split an infinitive! ;-) Monstrosities such as "to briefly discuss" (a split infinitive) and "has been also discussed" (a misplaced adverb) are quite common in English, but are unacceptable for academic writing.

7. Common mistakes (see the cited books for a proper treatment of these!):

Spelling occurrence as "occurence".

Getting mixed up between "it's" and "its". The former is a contraction of "it is", e.g. "it's the heart's filthy lesson", while the latter is used to denote possession, e.g. "A baby can have its own passport."

-is- versus -iz- spelling. I can never remember which is which, but I think -is- is the correct English spelling (e.g. realise, normalisation) and -iz- is an Americanisation (or should that be Americanization? :-)). Probably more important than getting it 'right' is to choose one or the other and stick to it consistently, although if you're quoting something, you shouldn't 'correct' what it says, but you could put a footnote to say "so-and-so calls this such-and-such, but that's their term".

Be consistent with your use of the mod operator. For example, if I wrote a + b mod c, I would mean "a plus the remainder of b divided by c" (and not "the remainder of a plus b divided by c"). There are also other uses of mod, such as a + b = c (mod d), where mod is not an operator at all: if I remember rightly from CS2022, this says that a + b and c leave the same remainder when divided by d.
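The different readings of mod can be made concrete in code. The following is a minimal sketch, assuming Python's % operator as a stand-in for mod; the values and the helper name congruent are my own illustrations, not anything from the talk. It also shows the usual reading of the congruence notation a + b = c (mod d), namely that a + b and c leave the same remainder on division by d.

```python
# Python's % binds more tightly than +, so a + b % c means a + (b % c).
a, b, c = 7, 10, 4

operator_reading = a + b % c      # 7 + (10 % 4) = 7 + 2 = 9
grouped_reading = (a + b) % c     # (7 + 10) % 4 = 17 % 4 = 1


def congruent(x, y, d):
    """Return True when x and y leave the same remainder on division by d."""
    return (x - y) % d == 0


print(operator_reading, grouped_reading)  # 9 1
print(congruent(a + b, 1, c))             # True: 17 and 1 both leave remainder 1
print(congruent(a + b, 2, c))             # False: 17 leaves 1, but 2 leaves 2
```

The two readings give different answers (9 versus 1), which is why it pays to state the convention once in the thesis and add brackets wherever a reader might hesitate.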

The Viva (PhD Oral Exam)

There are three reasons why PhD candidates have to have a viva. It is so that the examiners can see:

whether you understand what you did;

whether it is worth a PhD (i.e. is a contribution to knowledge);

whether it is your own work.
