CS 460 - Software Engineering - UNM Spring 2014: February 2014

Wednesday, February 26, 2014

Client meeting reaction and project log (week of 2/23)

Our client meeting went MUCH better this time, or at least it felt that way to me. Since our last one, we did some soul searching and had some important conversations on design and implementation plans. This time we also had documents to back up our story - we had a more complete user story with a website mockup from Sonny, Alan provided a detailed description of the graph search we intend to use, and I provided a comparison summary of the database systems I had researched.

Going forward, we are trying to make some skeletons and prototypes. Alan is going to write his basic setup to perform greedy graph search on diagnostic charts with edge weights (as well as managing and updating those edge weights when solutions are reported by the user). David is going to stand up two web frameworks, Django (Python) and Play (Java/Scala), so he can compare them. Sonny is going to provide us a basic website like the one his mockup depicted. Finally, I am going to stand up a remote MongoDB database and work with Alan on a schema for it which will play nicely with his component of the system. The database will store diagnostic procedures and information about which causes from those procedures are the most likely (i.e. are reported most often).
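To make the edge-weight idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the class name `DiagnosticChart`, its methods, and the sample symptom and cause names are hypothetical, and this is not Alan's actual design. It counts how often each cause is reported as the fix and greedily suggests the most-reported one first.

```python
from collections import defaultdict


class DiagnosticChart:
    """Hypothetical chart: edges are weighted by how often a cause was reported."""

    def __init__(self):
        # edge_counts[parent][child] = times `child` was reported as the fix
        self.edge_counts = defaultdict(dict)

    def add_edge(self, parent, child):
        self.edge_counts[parent].setdefault(child, 0)

    def report_solution(self, parent, child):
        """Update the edge weight when a user reports what fixed the problem."""
        self.edge_counts[parent][child] = self.edge_counts[parent].get(child, 0) + 1

    def weight(self, parent, child):
        """Fraction of all reports under `parent` that pointed at `child`."""
        total = sum(self.edge_counts[parent].values())
        return self.edge_counts[parent][child] / total if total else 0.0

    def next_step(self, parent):
        """Greedy choice: the child with the highest weight (None for a leaf)."""
        children = self.edge_counts.get(parent)
        if not children:
            return None
        return max(children, key=lambda child: self.weight(parent, child))


chart = DiagnosticChart()
chart.add_edge("won't start", "dead battery")
chart.add_edge("won't start", "bad starter")
chart.report_solution("won't start", "bad starter")
chart.report_solution("won't start", "bad starter")
chart.report_solution("won't start", "dead battery")
print(chart.next_step("won't start"))  # bad starter
```

Storing raw counts and deriving the frequency on demand keeps the update step trivial; a real version would also need to decide how to treat brand-new edges with no reports yet.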

I feel a bit better about the status of our project than I did last week.

Sunday, February 23, 2014

Some musings on databases and the NoSQL vs. SQL debate

So I've been doing some research into what kind of database this project will use for the knowledge base component. There may be a separate database for managing user accounts, which is a much simpler functionality. The short of it is that I'm thinking MongoDB is a good fit...

One of the main things I've been trying to decide is whether we want to do SQL or "NoSQL", and what the real strengths and weaknesses of those two 'classes' (if you want to call them that) are. Note that NoSQL is usually read as "Not Only SQL", and some NoSQL databases have SQL-like query languages but use different representations and organization for data. What I'm generally finding (and have seen in limited experience with systems such as MySQL) is that SQL databases, or more generally relational databases, are not all that good at encoding graphs or hierarchical objects. This is a natural weakness that comes from the explicit use of tables as the fundamental concept. Also, there's often a disconnect between how most object-oriented programs represent data and how that data can be (and is) represented in a relational database; this difference is sometimes called "impedance mismatch".

On the other hand, I'm liking what I find about NoSQL databases, and in particular document-oriented databases such as MongoDB. Instead of rigid tables and pre-defined schemas, a document database stores data in formatted files, which can be plain text but are often binary for efficiency. There is, of course, a syntax to these documents but there are generally not rigid requirements on what each object must or must not contain. This kind of flexibility allows similar objects to store things if they need them and omit them if they don't. I find this strength to be in direct contrast to the column concept in SQL-type database systems. You often see those types of database tables with some columns that are usually NULL - this is either because most records simply don't need or have that data, or because the column was designed into the schema in the beginning and remains because it's harder to remove it than to keep storing all those NULLs.

A key aspect of this decision is that we will have some complex data structures in the database. Per the work Alan Kuntz is doing, we will be storing graphs in this database representing diagnostic procedures with edge weights reflecting frequency of problem occurrence. I am deeply concerned that using a relational database for this purpose will cause us nothing but heartache and pain. By contrast, I believe a procedural step object (node) in a system like MongoDB could look something like this (pseudocode of course):

var node_step = {
  "instructions": "check X",                 // what's displayed to the user
  "id": "12345",                             // a key into this object
  "connected": ["12678", "78912", "56742"],  // connected nodes, could also point to Edge objects
  "value": 0.45                              // frequency of occurrence (based on repair history)
};

By contrast, I'm not all that sure how this would look in a relational database like MySQL. I guess we would have a table for node_step, with the columns shown above, and the field "connected" would be a set of numbers (there may be an inherent flexibility about the number of connections right there) that are foreign keys into other entries in node_step. To me, that document-oriented style just seems so much more suitable. MongoDB uses BSON, a binary encoding of JavaScript Object Notation (JSON). JSON is a language-independent data format that plays nicely with Java, JavaScript, and many other languages. The format of JSON is basically like that node_step object above. I am feeling like the choice of this technology is a critical one because this knowledge base is essentially the core functionality of this application.
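For comparison, one common relational layout puts the connections in a separate edge table instead of a single "connected" column. The sketch below uses Python's built-in sqlite3 module with an in-memory database rather than MySQL. The table and column names (`node_step`, `edge`, `from_id`, `to_id`) and the sample rows other than "check X" are assumptions for illustration, not a schema we have settled on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node_step (
        id           INTEGER PRIMARY KEY,
        instructions TEXT NOT NULL,      -- what's displayed to the user
        value        REAL                -- frequency of occurrence
    );
    CREATE TABLE edge (
        from_id INTEGER NOT NULL REFERENCES node_step(id),
        to_id   INTEGER NOT NULL REFERENCES node_step(id),
        PRIMARY KEY (from_id, to_id)
    );
""")
conn.executemany(
    "INSERT INTO node_step (id, instructions, value) VALUES (?, ?, ?)",
    [(12345, "check X", 0.45), (12678, "check Y", 0.30), (78912, "check Z", 0.25)],
)
conn.executemany(
    "INSERT INTO edge (from_id, to_id) VALUES (?, ?)",
    [(12345, 12678), (12345, 78912)],
)


def connected_steps(connection, node_id):
    """Return (id, instructions) for each node connected to node_id, likeliest first."""
    rows = connection.execute(
        "SELECT n.id, n.instructions FROM edge e "
        "JOIN node_step n ON n.id = e.to_id "
        "WHERE e.from_id = ? ORDER BY n.value DESC",
        (node_id,),
    )
    return rows.fetchall()


print(connected_steps(conn, 12345))  # [(12678, 'check Y'), (78912, 'check Z')]
```

It works, but every hop through the graph costs a join, which is part of why the document style feels like a more natural fit for this data.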

Friday, February 21, 2014

What it's like to have your project picked

It's both a great and a worrying feeling. Questions start rushing through your head. Do I actually know what I'm doing? Is my proposal doable? Is this team going to complete the job with me at the helm? I'm hoping the answer to all of these questions is YES. I feel grateful that I think I've been assigned a good team. I believe the guys in my group are solid. I won't let 'em down.

To Mechanapp!

Sunday, February 16, 2014

Final project selections

Well, I did it. I made my final project selections.

Thinking back on it, this was a pretty cool road that got us here. We all had to cook up an idea, define and refine it, and try to sell it to each other. I'm genuinely impressed at what some people came up with, and I'm excited (and a bit nervous) to see what gets picked and where I get assigned. I'll admit that most of my project picks were heavily influenced by the presentations. A notable exception to that was Automaton, a proposal for a computer science educational game by Luke Balaoro. His presentation was good, but not quite enough to make me note-to-self to go read his proposal. I ended up reading it anyway, and wow. Good stuff. I hereby give the 'best overall proposal' award to Luke. It was really well-written, gave a lot of solid detail, and seems like an interesting idea. I didn't rank it highest in my preferences only because I'm not sure how much I want to work on a game this semester (don't think it's really my forte).

Good luck to everyone on the vote next week.

Friday, February 14, 2014

Thoughts on pitches and effective 'project marketing'

So we wrapped up our project pitches today. I think mine was good, but there's that usual lingering feeling that I could have been better prepared and organized for it. I'm noticing something interesting as I go to narrow my preferences for project assignment down to 5: the in-person pitches I've heard over the last couple days are a surprisingly strong influence on that process. As I watched these presentations, I was noting projects I found interesting to further research and consider for my preference list. I've grazed through the relevant proposals, but found myself also looking at ones I hadn't noted to pursue during class time. What I'm finding is that some of those 'edge' projects are actually quite interesting, and a couple of them are possible candidates for my list. It's kind of funny to note how much a two-minute speech can sway you for or against a project in comparison to a detailed 15-pager. I suppose the speech format is more effective at manipulating our emotions (i.e. "do I like you? do I believe you can lead this project?"), while the actual proposal paper appeals more to our logic and reasoning (i.e. "he seems to have everything planned out well").

Sunday, February 9, 2014

Proposal Review for Brandon Lites

Review of proposal: "Ambient Algebra"

Proposal author: Brandon Lites

(blog: http://blitescs460.blogspot.com)

Reviewer: James Vickers (jvick3@unm.edu)

Proposal restatement

The proposal is to make a set of mini-games which teach college students algebra concepts when played. The project seeks to address high failure rates in college math courses and low proficiency of students. The games will be accessible online and the site will track user progress and provide facilities for leaderboards and achievements for players.

Reviewer reaction

As a former math tutor at CAPS, I know first-hand many of the problems this article discusses. Many students are not motivated to learn math early on. They actually can get quite interested if the topics are presented to them in more relevant ways. I think educational games are a good way to do this, if they can be made appealing enough for college-age students. I, like many others, have learned skills from games. I learned to type at a young age by playing educational typing games.

Quantitative scores

Format: 4

Overall, the format is good. I would consider trimming down the previous work section. Some of the information included there does not appear relevant to the proposal. The budget and timelines could be nicer (the budget should probably be in a spreadsheet or table rather than the way it's displayed).

Writing: 4

The writing style is clear and concise, but the paper needs a proofread and polish - some sentences are missing words or have the wrong word if you read them aloud.

Goals and tasks: 4

The timeline lists each member for 3.5 hours for the first two weeks, but at least 10.5 hours per person for each subsequent week. Sounds like a risky slow start to me. Otherwise, the timeline and its milestones seem reasonable. I like how the timeline has a min-max range for hours worked each week.

Scope: 3

Project is stated to be a supplement to mathematics instruction throughout the proposal. However, at one point it is stated that "Ambient Algebra is designed to replace a student's homework in which they solve problem after problem". I think this single statement may be a dangerous overreach of scope for this project. This would likely cause backlash from universities, and it may not be best for students to practice for exams and quizzes in a totally different format (game vs. on paper).

Plausibility: 5

Project appears perfectly feasible, and the author clearly identifies the technologies to be used. There is, of course, a serious challenge to be had in making a game both fun and educational. I think this may be amplified by the fact that the game is targeted for college-age students; I think marketing may be a key factor in getting these students to want to play games of this nature.

Novelty: 3

Early in the article you say that, of existing educational math games, there are "none in which learning algebra is the secondary motivation of playing the game" (page 2). Later, on page 4, you go on to say that "there are websites that offer games to teach algebra". As a reader, I read the first statement as a claim that no game websites for math education existed (which I was skeptical of). The second statement acknowledges the other games and explains the differences between them and your proposal. The main novelties of the idea are a different target audience (college-aged instead of grade-school aged) and the use of leaderboards and achievement tracking. It's not clear to me if the second novelty exists elsewhere already.

Stakeholder identification: 2

Students (the main users) are identified as the major stakeholder. The United States as a nation is sort of an implicit stakeholder in the article, through the discussion of its dismal test scores. I think more should be said about some other key stakeholders, namely universities (who may suggest the site for students or even make donations of time/money to it) and people or groups that sponsor students (such as scholarship foundations or parents).

Support and impact: 3

The project will charge a fee of $10 per semester for access. The budget section of the proposal claims that "With around 1400 students taking this course each semester, we can assume a revenue of $14,000 dollars per semester." I find this statement way too optimistic. You can hardly expect every kid in a math class to buy the correct version of the textbook and a calculator as it is. This claim also forgets that the problem it seeks to address (high failure rate of these classes) will also work to invalidate this projection - many students drop in the first 2-3 weeks from a lack of motivation or self-confidence. The pricing model suggested may or may not be appropriate, as similar educational game websites instead collect revenue from advertising and do not charge their users any fees.

Evidence: 4

Your motivation section (II) is SOLID. Giving stats on the failure rates of early algebra classes at UNM and the relative scores of the nations of the world really highlights the issue your project seeks to address. The budget could perhaps use a little more break-down and thought. For example, programmers are going to be paid $35 per hour (when the national average is more like $45), and the workstation for the project manager costs twice as much as for a team member (though it's not clear why that is).

Challenges and risks: 4

The main challenge discussed is making games that are both fun and educational. Another one mentioned is making sure the games are relevant to common areas of struggle for students. I only gave this section a 4 because I think another one exists that should be mentioned: convincing instructors to get over biases they may have about educational games so they may recommend this one for their students.

Wikipedia: software design patterns

First off, I find it interesting that the article quickly states that many software design patterns are object-oriented (and therefore involve explicit state), and are not very applicable to functional programming paradigms. I'd like to read more about the design paradigms that functional languages are used with. It seems like object-oriented design is sort of at the heart of many (or most) of the design patterns listed in this article.

I think the fact that design patterns are not directly implementable (i.e. are not software specs) is both a great strength and a weakness. This trait means they are flexible and more abstract than actual software specs or prototype programs, but there may be some ambiguity in making an implementation of some patterns. The implementations may vary across languages and platforms to a degree that it can be brought into question if they still reflect the design they were trying to adhere to. But after all, I guess that previous statement is a general one about the common gap that exists between software design and implementation.

Some interesting design patterns:

Bridge: "decouple an abstraction from its implementation allowing the two to vary independently". To me, this sounds like a description of technologies like the Java Virtual Machine (JVM).

Iterator: "Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation". I love design patterns like this one. When well-made, iterators are a nice abstraction that allows you to loop through a set or list without thinking about where things are or how they are stored.
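A small Python sketch of what I mean, with a made-up `Playlist` class (hypothetical, not from the article). The caller just loops; it never learns that the songs live in a dict keyed by slot number.

```python
class Playlist:
    """Hypothetical aggregate that hides how its songs are stored."""

    def __init__(self):
        self._songs_by_slot = {}  # internal representation: slot number -> title

    def add(self, slot, title):
        self._songs_by_slot[slot] = title

    def __iter__(self):
        # Yield titles in slot order; callers never touch the dict directly.
        for slot in sorted(self._songs_by_slot):
            yield self._songs_by_slot[slot]


playlist = Playlist()
playlist.add(2, "second song")
playlist.add(1, "first song")
for title in playlist:
    print(title)  # prints "first song", then "second song"
```

If the storage later changes to a list or a database cursor, the loop at the bottom stays exactly the same.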

Lazy initialization: "Tactic of delaying the creation of an object, the calculation of a value, or some other expensive process until the first time it is needed..." I think most of the students in this class will think of Haskell when they hear this, except in that language it's called "lazy evaluation". Though I think it's mostly an efficiency thing, lazy initialization is also cool because it allows for some flexible data structures (such as 'infinite' lists in Haskell).
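Outside of Haskell, the same tactic can be written by hand. Here is a minimal Python sketch with a hypothetical `Report` class; the `build_count` attribute exists only to show that the expensive work runs once, on first access.

```python
class Report:
    """Hypothetical object whose expensive field is built only when first needed."""

    def __init__(self):
        self._summary = None   # nothing computed yet
        self.build_count = 0   # how many times the expensive work has run

    @property
    def summary(self):
        if self._summary is None:  # first access: do the work now and keep it
            self.build_count += 1
            self._summary = sum(range(1_000_000))
        return self._summary


report = Report()
print(report.build_count)  # 0
print(report.summary)      # 499999500000
print(report.summary)      # 499999500000 (cached, not recomputed)
print(report.build_count)  # 1
```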

Proxy: "Provide a surrogate or placeholder for another object to control access to it". I think this is a very common design pattern. For instance, some large software systems have a kind of 'manager' module through which all reads and writes to a database must be handled.
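A rough Python sketch of that 'manager' idea follows. Both classes are hypothetical, and the "database" is just a dict. The manager stands in front of the real object and decides who may write to it.

```python
class Database:
    """Stand-in for a real database; just a dict here."""

    def __init__(self):
        self._rows = {}

    def read(self, key):
        return self._rows.get(key)

    def write(self, key, value):
        self._rows[key] = value


class DatabaseManager:
    """Proxy: all reads and writes go through here, so access can be controlled."""

    def __init__(self, database, writers):
        self._database = database
        self._writers = set(writers)  # users allowed to write

    def read(self, user, key):
        return self._database.read(key)

    def write(self, user, key, value):
        if user not in self._writers:
            raise PermissionError(f"{user} may not write")
        self._database.write(key, value)


manager = DatabaseManager(Database(), writers={"alice"})
manager.write("alice", "step-1", "check X")
print(manager.read("bob", "step-1"))  # check X
```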

Lock (parallelism): "One thread puts a 'lock' on a resource, preventing other threads from accessing or modifying it". This functionality is often critical in multi-threaded applications to prevent screwy behavior. Message-passing libraries have a similar idea: MPI (for C and Fortran) offers calls such as MPI_Win_lock to lock a remotely accessible memory window, though there the lock is held by a process rather than a thread.
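For a thread-level example, here is a minimal sketch using Python's standard `threading.Lock`. The `SafeCounter` class and `count_in_parallel` function are hypothetical names of my own; the point is only that the lock makes each increment happen one thread at a time.

```python
import threading


class SafeCounter:
    """Counter whose updates are guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread at a time may run this block
            self.value += 1


def count_in_parallel(num_threads, increments_each):
    """Have several threads bump one shared counter; return the final value."""
    counter = SafeCounter()

    def worker():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


print(count_in_parallel(4, 10_000))  # 40000
```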

Thursday, February 6, 2014

Review of reviews

So I got two reviews on my proposal, which was posted last Monday (2/3). I intend to make revisions based on the comments I received, but also from my own thoughts. I think reading a couple other proposals in the last week and hearing what some other people had to say about what makes one a winner or a loser has given me a better idea of where to go with it.

Some notes from the review by Ronald Shaw:

1.) Ronald said in a couple different ways that I should pick a platform and some of the key technologies (such as programming language) in the proposal. I think he's right and I intend to incorporate that into the next release. I think it will go along with what we talked about in class last week, "making nothing into something".
2.) He's also right that I did not fully identify my stakeholders. I mainly only described my customer base. I need to also think about and include the development team, automotive dealers, mechanic shops, and potentially negative stakeholders such as AllDATA (maker of some existing automotive repair software).

Some thoughts on the review by Kevin Dilts:

1.) Kevin mentioned this project proposal reminded him of some discussions from the CS 427 (Intro to AI) class last semester. That is about the time this idea started brewing in my mind, actually. We did a couple (rudimentary) examples of a theoretical logic system that could diagnose a car. Much later in the class, we discussed different types of AI expert systems, including case-based expert systems. I think this is how the idea got into my head that a hybrid expert system (case-based and rule-based) for car repair could be something new and potent. I was also pleasantly surprised to learn that such a system apparently does not exist at present.
2.) Kevin mentioned a potential challenge of the system not addressed by the proposal. He said a system that applies diagnostic rules in a probabilistic manner (based on case histories, as this project proposes to do) runs some risk of giving bad diagnoses. I do not believe this risk exists, but the comment shows that perhaps I didn't explain this part of the system very well. Imagine the system is leading the user through a diagnostic chart as if it were a tree, using some kind of graph search algorithm like Breadth-First Search (BFS). However, in this system, the search is directed towards the branches of the tree which are believed to be the most promising (i.e. are the most common causes of a particular problem on that vehicle or class of vehicles in the past). So, the search is traversing the most promising branches of the search space first (we hope). Even with the worst possible information guiding this process, the search algorithm is still complete despite its probabilistic nature (after all, it's built on top of BFS): bad weights only change the order in which branches are tried, not whether they are tried. I intend to simplify and detail this description and add it to the proposal so it may be more easily understood.
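Here is one possible reading of that description as a Python sketch: a plain BFS in which siblings are queued most-likely-first. The function name `guided_bfs`, the tiny example chart, and the weight values are all made up for illustration; the real system may end up looking quite different. The example runs the search once with helpful weights and once with misleading ones to show that the cause is found either way.

```python
from collections import deque


def guided_bfs(tree, weights, root, is_cause):
    """BFS over a diagnostic tree, visiting more likely siblings first.

    tree:    dict mapping a node to its list of children
    weights: dict mapping a node to how often it was the cause in past cases
    Returns (node_found_or_None, list_of_nodes_in_visit_order).
    """
    visited = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        visited.append(node)
        if is_cause(node):
            return node, visited
        # Weights only decide the order of siblings; every child is still queued.
        children = sorted(
            tree.get(node, []),
            key=lambda child: weights.get(child, 0.0),
            reverse=True,
        )
        queue.extend(children)
    return None, visited


chart = {
    "no start": ["battery", "starter"],
    "battery": ["terminals"],
    "starter": ["solenoid"],
}


def is_solenoid(node):
    return node == "solenoid"


good = {"starter": 0.7, "battery": 0.3}
bad = {"starter": 0.1, "battery": 0.9}

print(guided_bfs(chart, good, "no start", is_solenoid)[1])
# ['no start', 'starter', 'battery', 'solenoid']
print(guided_bfs(chart, bad, "no start", is_solenoid)[1])
# ['no start', 'battery', 'starter', 'terminals', 'solenoid']
```

With misleading weights the search takes one extra step, but it still reaches the cause, which is the completeness property I was trying to describe.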

Wednesday, February 5, 2014

Thoughts on the "Software craftsmanship" movement

On the face of them, the ideas associated with this "software craftsmanship" movement seem beautiful, even noble. It basically challenges some assumptions about the making of software, namely that it is strictly a kind of methodical and predictable process. It makes a case that software engineering is as much a craft as it is a science or engineering discipline. I like these ideas, I think. One thing I'm not sure about is some of the ways that they phrase them.
From the "Manifesto for Software Craftsmanship" (http://manifesto.softwarecraftsmanship.org/):

"Not only working software, but also well-crafted software"

It's kind of unclear what this means. I think it's supposed to mean all the things we as software people like in code design, such as reusability, readability, and robustness. But putting it this way, in my opinion, is too vague. It leaves it open to a lot of interpretation, and might be called "soft" or "washy" by outsiders.

"Not only customer collaborations, but also productive partnerships"

I don't really get this one either. Isn't a 'customer collaboration' a kind of productive partnership? Do all transactions in the software world have to be in the form of some kind of partnership? I don't see anything particularly wrong with contracted work with a company that doesn't care about your company. This statement in the manifesto seems to suggest every company you make software for should have a personal interest in your company. If so, is that reasonable or even desirable?

I do agree with this movement that the coding skills of the developers themselves are important, but I don't agree that they're the most important. To claim that's the most important part of a process as complex as software engineering is to close your eyes to so much. The best coders the world over could be led into nowhere by clueless project managers or a marketing team not bringing in paying customers. To claim that the quality of produced code is the most important aspect of a project is somehow noble, but appears misguided. The quality of the code really makes little difference on what a system does; it makes a bigger difference on what the system can be made to do easily (i.e. modularity).

My ideal project team member

The three qualities each member of my team will have, ideally:

1.) Reliable

This is critical. I need someone who lives by their word so I can count on them. In a small team project like this one, the failure of one member could bring it all crashing down. Without reliability, a person may as well be nothing at all.

2.) Adaptable

The only thing constant is change. Ideally, the developer can quickly change his actions and thinking to cope with rapid change. He should be able to think on his feet as conditions evolve or the worst-case scenario unfolds.

3.) Dedicated

The person has to have the strongest desire to succeed with this project. This trait means that this person strives for excellence in everything he does, even the things he dislikes or thinks he's weak at. It also means he does not fold when things get a little dicey.

Three other qualities not in the top three, but which are also highly desirable:

4.) Creative

In the view of many, software engineering is as much an art as a science. Often, the best solutions come from the more creative among the group. Creativity in this context enables beautiful and novel solutions that a logical mind simply doesn't make on its own.

5.) Honest

This quality is important for the group to trust each other and communicate effectively. If a team member feels there is a problem looming, they should say so even if it may offend another member (who, perhaps, is the cause). An honest team member gives their true opinion, regardless of the politics of the group.

6.) Personable

This trait simply means the person is easy to get along with and can effectively communicate with others. It almost warrants a slot in the top 3, but is not required for an awesome team member. A likeable developer is a big plus when talking to customers or with the team.

Tuesday, February 4, 2014

Review of "WorldBand" proposal by Ronald Shaw

Review of proposal: "WorldBand"

Proposal author: Ronald Shaw

(blog: http://rbshaw5.blogspot.com/)

Proposal reviewer: James Vickers (jvick3@unm.edu)

Proposal restatement

The proposal is to make a social web site for collaboration between bands. For instance, a user could upload a track or sample from a simple instrument and other users can do the same (but likely for other instruments), by which music can be made from distinct pieces written by different people.

Reviewer reaction

The project is novel and interesting. I'm usually not big on social media ideas, but this one strikes me as cool on the surface. I think much more attention needs to be paid to the music collaboration tool itself. I think perhaps the proposal writer is withholding details on this aspect on purpose, which may or may not be wise (we need a taste of it at least).

Quantitative scores

Format:

Good layout. I especially like the "Context of work" diagram (section 4b), which shows basic transactions to take place in the project's ecosystem. The in-depth timeline section is detailed, but at a cost of added length. It could possibly be put into some kind of calendar format. I also think the "Work partitioning table" (section 4c) could probably use a column for the actors involved (i.e. user and site, advertiser and site, etc.).

Writing: 5

No complaints. Style is clear and simple to read.

Goals and tasks: 3

The goals of the web site interface are well-defined, but those of the music collaboration tool it hosts need to be expanded. It's a lot of the novelty of the project, and we need to know how users from across the world will be able to work with each other without getting frustrated.

Scope: 5

The project is meant to be a web site for people to upload, download (for a fee), and collaboratively create new music by combining tracks or samples. Little ambiguity to be had.

Plausibility: 4

The product seems plausible overall. One possible difficulty is managing the collaboration between users on a single music file to keep them from clobbering each other's work.

Novelty: 5

This idea seems so good that I'm still trying to figure out if it already exists. It's stated that GarageBand (by Apple) does not have a functionality to collaborate on music online, which I find surprising (not that I've used GarageBand, I just thought that was part of its purpose). If such a site/tool doesn't exist, it seems like it should come into existence.

Stakeholder identification: 4

Stakeholders are listed neatly. I do however think some are missing, related to the possibility of plagiarized music being sold on the site. In that case, groups such as the RIAA (Recording Industry Association of America) or government agencies could become negative stakeholders.

Support and impact:

The project has convincing impact, in that it could be used to help people from all over the world make music together. It's one of those ideas we only dream of in the internet age. The more users the site has, the better it gets; this growth model is a double-edged sword by which some sites like YouTube become huge and others die in the night without a sound.

Evidence: 4

Your budget is nice and detailed, but I notice that week 2 is budgeted well above the 75 hours you said you had available for each week. I find that you have done your research on web design and the related technologies, as well as competitors' products and the features they lack for your product to fill. The "Context of work" (section 4b) diagram is a nice summary view of the project's scope and function. There is room for improvement in describing the type of interface envisioned to allow collaborative music writing, the crux of the proposal.

Challenges and risks: 3

I think there is an important legal and ethical risk missing from the proposal. The site is paying people who create parts of music tracks when they are downloaded by users. What isn't mentioned is the distinct possibility that some of those samples or tracks are already plagiarized. In that case, this site will be paying the wrong people for music that neither the site nor the person who uploaded it owns. I think the other challenges and risks are addressed rigorously.