
Sunday, February 9, 2014

Goes Somewhere! Does Something!

Well, it's been almost a year.

The point of keeping this blog was to use it as a teaching tool; when I teach, I also learn, and it reinforces what I've learned. Basically, I was hoping to simultaneously provide a service, as well as bolster my own programming and engineering skills.

Unfortunately, when I run across ideas I really, really want to blog about, I tend to do my due diligence first, to make sure I'm getting things right. On the upside, this research often reinforces my knowledge and tends to validate what I'm about to teach. Which is nice. On the downside, in the course of my research, I invariably find that someone has already said what I wanted to say, but better. Or they've done something neat I was thinking about doing, except they've already gone and -done- it while I'm still thinking about how to get it into a neat blog format.

So, now that nobody reads this blog anymore, I feel it's safe to shift gears. I'm still going to go through the motions of writing my usual computer science and engineering posts, but from now on, I'm going to make brief posts loaded with links to these other resources I've been finding. That way everyone can benefit from my infovore tendencies. You're welcome.

However, that doesn't really make for a steady stream of new blog updates. So now I'm going to shift the gears of the blog ( again ). This time I'm going to try to post at least once a week on my progress in world-building a science fiction universe I've thought about on and off for over a decade now. Nothing will ever come of it, but it'll be nice to have all the information on my little universe in one easy-to-search place.

The format will be simple. I'm going to spend a few weeks getting the bare bones framework of the universe built, and talk about stuff in sci-fi that I like to talk about. Once I've built up some steam, I plan on throwing in short stories that take place in this universe.

I make no apologies in advance. I'm not a writer by trade; I'm a programmer. Which is like writing, except instead of telling stories, you're telling a very literal-minded, temperamental machine what to do. So I guess it's more like being a parent. Which is not like being a writer.

Wednesday, April 24, 2013

Give Scotty his Due


People are bad at making long-term planning estimates.

People are -really, really- bad at making long-term planning estimates.

And just to be clear, I am definitely included in the set of 'people'.

And when I say 'long-term planning estimate', I am specifically talking about estimating how long a complex task will take. And by that, I don't mean something like writing a 2-3 tree or re-writing malloc. Those are complex tasks, sure, but there's lots of help available for them, and we already have metrics for similar things.

I'm talking about the kind of long-term estimate that we are so often asked to make in industry. How long will this big semi-specified project take to finish? Or even how long will this fully specified project take to finish?

How often do you hear about a project releasing on time, on budget?

To be fair, there is a certain amount of confirmation bias going on. We don't hear about the on-time on-budget projects as much because they're deemed unremarkable and not news worthy. That's the way things are -supposed- to work, and when things work as designed, there's very little fanfare.

However, any number of us can think of projects that have gone massively over time and over budget. Promised big, delivered late.

I'm looking at you, Half-Life 2: Episode Three. Or whatever you're going to be called when you're finally released. And for software that's at least known not to be vaporware, there's Duke Nukem Forever, which was in development longer than the moon program ran.

Granted, these are two extreme cases, and in the case of Duke Nukem Forever the developers seemed firmly intent on shooting themselves in the foot, but we all know of dozens of other properly managed software projects that have both gone over time, and over budget.

What's going on? Why are we so bad at making estimates?

I think we're consistently answering the wrong question.

To get a correct estimate would require a rather large investment of effort. We'd need to research how other, similar projects have fared. We'd need to look deep into the domain in which we're developing, because software often requires you to have two sets of knowledge: Knowledge of programming, which we have, and knowledge of the field we're programming for, which we need to learn on the fly. For example, it'd be bad form for me to try to program even something as simple as a restaurant POS system without having some idea of what their needs are and what their use cases look like. So now I need to factor in research and learning time. Then we need to determine how long testing will take, which again should be done not by just sitting and thinking about it briefly, but instead by comparing how long testing has taken for similar projects.

That's a lot of work. And that's before we get to all-too-common problems like funding hiccoughs, software and hardware issues, or spec changes.

You know what's easier? Determining how confident I feel that I can accomplish this task, multiplying that by how complex I think the task is, and then making an estimate based on -that-.

I think that's the question we're actually answering when we give a time estimate. Not how long we think it'll take - that question is hard, and very possibly can't be answered correctly. But I absolutely know how confident I feel and how complex this looks.

So what's the fix?

Not making estimates might be a good start. But management hates that, and that's perfectly reasonable. The business world is not made on good will and best wishes. Doing all the research necessary to make a good estimate is a nice second. But the company might not wish to pay for the effort required there, and if you're doing something truly novel, there may not be any projects similar enough to compare against. Also, unquestionably, we all think we can do better than our peers, statistics be damned.

I just give Scotty his due.

Scotty, of course, is the engineer of the USS Enterprise, who rather famously would multiply his time estimates for how long something would take to get done. He knew Kirk would always tell him he had half that time to do it. So he'd give estimates -four times- larger than what he actually thought it'd take him. If he got it done on time, well, good, no sweat. However, when the captain ( manager for us ) cut him back, he still had slush room. And he -still- had slush room after that. The captain chopped by half, he over-estimated by four, so he still had double the amount of time he thought he'd need. And of course, being a dramatic entity on a hit sci-fi show, he always got things done either just-in-time for dramatic effect, or under time, and could be called a 'miracle worker'.
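
Just to put numbers on that arithmetic, here's a minimal sketch; the ten-day figure is made up for illustration.

#include <iostream>

int main() {
    double gut_estimate_days = 10.0;                   // what I actually think the task will take
    double quoted_days       = gut_estimate_days * 4;  // what Scotty tells the captain
    double after_cut_days    = quoted_days / 2;        // what's left after the captain cuts it in half

    std::cout << "quoted: " << quoted_days << " days, "
              << "after the cut: " << after_cut_days << " days, "
              << "which is still " << after_cut_days / gut_estimate_days
              << "x the gut estimate\n";
    return 0;
}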

We're not Scotty, but his methods just might work for us. Take your estimate. Add an order of magnitude. Then multiply that by four. Give Scotty his due. Maybe that estimate will actually shake out.

Wednesday, April 17, 2013

Compiled versus interpreted


Some languages ( C, C++ ) are typically compiled; others ( Python, Common Lisp ) are typically interpreted.

What's the difference?

It's worth noting that any language can be compiled -or- interpreted. All the interpreted languages I'm familiar with also have compilers. I've heard that there is a C interpreter out there, but that making a sane C++ interpreter is difficult, possibly impossible. Still, there -are- interpreters out there for what are traditionally compiled languages.

Also worth noting up front is that the end result is the same, either way. Your code-as-written will be broken down, parsed, turned into machine code, and then run by the computer's CPU.

So, back to the question. What's the difference?

A compiled language will produce an executable by default. The compiler will come along, analyze your code, build a symbol table where it keeps track of the names of things, their datatypes, and so on, and then it will emit a whole bunch of machine code. The file that's produced can then be run by the OS later. Since your code was compiled ahead of time, the symbol information is easy to deal with - when the compiler runs across a symbol, for example a function or variable name, it knows where to look for it.

For an interpreted program, life is a little more difficult. A lot of things that would be handled by the symbol table at compile time now have to be handled on-the-fly by the interpreter. On the other hand, interpreted languages can use dynamic typing, which is a boon, as well as dynamic scoping, and they tend towards being more platform-independent ( Java, for example, compiles to bytecode, which is then interpreted by the Java virtual machine, and is famous for its 'write once, run anywhere' slogan ). They also have their downsides; the main one is a bit more overhead than a compiled language. With a compiled language, the grunt work of turning your code-as-written into machine code is done at compile time, and then the compiler exits. It's no longer sitting in memory, taking up resources. An interpreted language, however, needs its interpreter around the whole time it runs. Otherwise, there's no machine code, and nothing gets done.

So, long-story-short, the interpreted languages need an interpreter. The compiled languages get compiled, but don't need the compiler once they've been made into an executable. Either way, your code is still getting turned into machine instructions for the CPU, there's just a difference in how those instructions get there.
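
To make that "needs its interpreter" point concrete, here is a minimal toy sketch in C++: a loop that reads lines like "3 + 4" and evaluates them at runtime. It's purely illustrative and doesn't reflect how any real interpreter is built, but it shows the interpreter process hanging around and doing the work while the "program" runs.

#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::string line;
    while ( std::getline( std::cin, line ) ) {       // the interpreter loop never goes away
        std::istringstream in( line );
        double lhs, rhs;
        char op;
        if ( !( in >> lhs >> op >> rhs ) ) {         // parse "number operator number"
            std::cerr << "parse error: " << line << '\n';
            continue;
        }
        switch ( op ) {                              // dispatch on the operator at runtime
            case '+': std::cout << lhs + rhs << '\n'; break;
            case '-': std::cout << lhs - rhs << '\n'; break;
            case '*': std::cout << lhs * rhs << '\n'; break;
            case '/': std::cout << lhs / rhs << '\n'; break;
            default:  std::cerr << "unknown operator: " << op << '\n';
        }
    }
    return 0;
}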

Tuesday, March 5, 2013

My Workspace

I was asked by someone about my workspace. This is kind of an opinion piece. My way is certainly not the One True Way to set up a coding environment, and shouldn't be taken as such. Hell, even for -me- it's not a One True Way. It's possibly not even a -Better- way. It's just a way that lets me get things done without wanting to punch the machine.

So, first, platform. If I'm on a Linux box, my work gets done at the command line, all the time, every time. I still prefer to have a GUI with a desktop and graphical applications so I can Google for something real fast or distract myself for a few hours with that neat marble game, but actual coding work will be done at the terminal. I'm fond of Ubuntu's tab-capable terminal. I'll almost always make that sucker super big, open a second tab, navigate one tab to where I want to be in the file system, and use the other to run my current editor of choice. For most languages, I'll use vim. If I'm coding in Common Lisp, I'll use emacs, simply because the emacs/SLIME combination is, for my purposes, extremely hard to beat.

On Mac, I use a hybrid approach. I'll have the terminal open to where I suspect I'm going to be doing a lot of compiling and running of code. For actual editing, well, this really depends on my mood. If I just need to make a small, fast edit, I'll use vim from the command line. If I'm working with Common Lisp at all, again, emacs is the tool of choice. If I'm doing major work, Sublime Text 2 has been amazing. I don't even fully utilize all of the stuff it can do. I just really like its slick appearance and some of its capabilities, like highlighting a whole bunch of the same keyword all at once and then changing them all at once, and regexp search, and a few other things it does well.

On Windows, well, I don't typically code on Windows. When I do, I usually struggle along with Visual Studio. I am not saying anything bad about Visual Studio, in fact, it seems to be a really slick tool that would work fantastically if I'd just sit down and take the time to get really familiar with it. I don't, so I find it slightly annoying to work with.

You may have noticed that unless I'm on Windows, I'm not using a dedicated IDE. Yes, I know vim can be made to be very much its own IDE, but I haven't done that yet, so it doesn't count. It's not from a lack of trying. I certainly find myself in Visual Studio often enough, and I tried Eclipse for a few months, as well as XCode. The problem with all these tools is that I don't really want to learn a new tool. I get frustrated by starting a new project in an IDE. Also, Eclipse was really, really slow on my Mac for reasons I didn't investigate.

Also, part of me absolutely loves only relying on tools which are almost universally accessible. Sit me down in front of any given Mac or Linux box, and I can code, without feeling frustrated about my favorite tool not being available. That's -neat-.

It's worth noting that almost all of my work to date involves either command-line apps or a handful of OpenGL programs using SDL. If I were to do more work which required frameworks, or which had to talk to a specific platform's GUI API, I would probably adapt to using an IDE pretty quickly. In particular, I've played with Qt Creator in the past, and I really liked it.

What are the benefits to my workflow, if any? Well, it works for me, and I can work pretty fast this way. Which should really be the goal of any work environment. If it lets you do the work you need to do, it's a good work environment.

Wednesday, February 27, 2013

Dates on a computer

Dates are one of those things in Computer Science that keep finding new and more entertaining ways to behave really badly. So here. Read this XKCD comic, and then everyone, please just follow the ISO standard. Hell, if everyone follows the ISO standard, everyone can keep doing that annoying thing where they store a date in an integer. That'll work okay until well after my lifetime is over. Note that other habits of storing a date as a single integer either don't sort as well ( the ISO format with the dashes taken out sorts perfectly fine using the standard integer comparators ) or have edge cases that'll behave oddly.

For example, say you store dates as month-day-year, and you need to store January first, 2001. Its numerical equivalent in month-day-year is 01-01-2001. To get your integer, remove the dashes.

That leading zero gets lost. If you're using -three- ints to store the date ( a distressingly common thing I see ), -two- leading zeroes get lost in the conversion to a single integer, and we wind up with 112001, and I have no idea what that is by the time your custom date format object gets passed to my code.

What I'm asking is: if you're going to be sloppy about your date formats, at least store them in a single int, in the ISO format.
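
For what it's worth, here's a small sketch of why the single-int ISO packing ( YYYYMMDD, dashes removed ) sorts correctly with nothing but ordinary integer comparison. The dates are made up for illustration.

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    // Dates packed as YYYYMMDD integers.
    std::vector<int> dates = { 20010101, 19991231, 20130227, 20010102 };
    std::sort( dates.begin(), dates.end() );   // plain integer < gives chronological order
    for ( int d : dates ) std::cout << d << '\n';
    // Contrast: packed as MMDDYYYY, 12312001 ( Dec 31, 2001 ) compares greater than
    // 2272013 ( Feb 27, 2013 ), which is the wrong chronological order.
    return 0;
}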

Though what I -really- want is for your date formats to actually be a robust first-class object in your system, but I understand that's a pain to code for.

If you decided to store dates as the number of seconds on your system, well, okay, that's fine, I can work with that. Please don't ignore things like the 2038 problem. ( Hint: If you're going to count seconds, use a bigger integer )

Wait, what's the 2038 problem?

Okay. Well, assuming a single signed 32-bit integer is storing time information, and assuming you're working on an architecture that treats the beginning of time as 1970-01-01 at 00:00:00 UTC ( so, pretty much all Unix-based systems ), AND you're storing time as the number of seconds since this beginning of time, the last moment that can be recorded correctly is 2038-01-19, at 03:14:07 UTC. One second later, the integer overflows and wraps around to a negative number, and the clock suddenly reads a date back in December 1901.
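
A tiny sketch of the overflow itself, assuming a signed 32-bit counter of seconds since the epoch:

#include <cstdint>
#include <iostream>

int main() {
    int32_t last_good = 2147483647;                               // 2038-01-19 03:14:07 UTC, the largest signed 32-bit value
    int64_t one_more  = static_cast<int64_t>( last_good ) + 1;    // widen first so the math itself doesn't overflow

    std::cout << "last representable second: " << last_good << '\n';
    std::cout << "one second later needs:    " << one_more  << '\n';
    // A real 32-bit time_t has nowhere to put that value; in practice it wraps
    // to a negative number, which decodes to a date back in December 1901.
    return 0;
}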


Note to actual professionals reading this blog: I'm still a college student. These are problems I deal with. Somebody tell me better in the comments.

Trivia point: if we're counting the number of years since 'the beginning of time', last year was year 42. I hope everyone remembered their towels.

Tuesday, February 12, 2013

Technical Debt

This post is not about money.

Any serious programmer who has been programming for any reasonable period of time, whether as a hobbyist amateur or making big money at the big company, will eventually have to make this decision:

Accomplish your task doing something dirty -now- and fix it later, or do it the correct way ( whatever your idea of the correct way may be ), taking longer, possibly even taking up time that is unavailable. And in almost every situation, the temptation to do it dirty and do it now is overwhelming.

Doing it dirty and doing it now has the advantage of instant gratification. You can see the results of your work sooner, get immediate benefits, and spend time on something more interesting or less vexing. Of course, you’ll fix it properly later. Maybe during a maintenance cycle, or during code review, or in the next patch, or when you revisit this program. Whenever it is, it is usually in the indeterminate and unplanned future, but we as programmers are -certain- we’ll fix it.

Of course, we never fix it. We’ll forget how we programmed our quick and dirty fix in a week, forget the problem domain in a month, and just plain forget the whole program in a year. Either we will move on to another project, or the demands of this project will always stay high enough that we are never able to quite get back to fixing our quick and dirty code.

And in a high entropy environment, one quick and dirty fix becomes two, becomes many. And we never get back to create those proper solutions we’ve always dreamed of. This is technical debt: taking out a loan in time now, and then never paying it back. The interest typically gets paid in code understanding or maintenance, and the quick and dirty fix can wind up costing us far more time than creating the proper solution in the first place would have.

It’s important to try to remember this. The technical debt for most projects will be incurred, sooner or later. It’s hard to keep in mind during crunch time or during finals or when you -just want to draw a box on the screen, dammit-, but that technical debt will tend to come back. Pay it now, and pay it forward. Don’t fall into the trap.

As a final note, though, moderation must be exercised here. While a quick and dirty fix now is almost never worth the end cost, a good solution now is often better than a perfect diamond solution, depending on what you are working on. Good judgement that comes with experience will help a lot here, but given the choice, I would recommend most programmers try for perfection rather than settle for quick and dirty.

Wednesday, February 6, 2013

Programming Paradigms

This is a placeholder for some terminology.

There are some ‘big’ programming paradigms that I’m familiar with. Procedural, functional, and object oriented are ones I have hands-on experience with. There’s also imperative versus declarative, and the idea of reflection, and macros, and... well. Programming languages tend to support one or more of these paradigms, and that, in turn, will affect the way in which a coder will code. Since production languages tend to be Turing complete, you can technically force any language into any paradigm. Having said that, programming languages are tools, and just like real tools, while you can force them into tasks they weren’t designed for, it’s probably better to just change tools.

Real fast, for those who aren’t programmers: Turing completeness describes a machine that can perform any computation a computer can perform. You can imagine a machine that reads symbols printed on an infinite ribbon, interprets those symbols to carry out instructions, and can also modify those symbols as it goes. For practical purposes, if a programming language has conditional branching and can read from, write to, and modify an arbitrary amount of memory, it’s Turing complete.

IMPERATIVE programs are programs that can be described as a series of orders. Imperatives, if you will. PRINT a number, GOTO a line number, do this, do that. You can think of it as marching orders for the machine. Procedural and object oriented programming tend to fall into this category.

DECLARATIVE programs just describe what the program should accomplish, not how it should go about doing it. Functional and logic programming languages tend to fall under this paradigm, as do database languages such as SQL.


PROCEDURAL programs are the ones I am most familiar with. They have a list of commands to be followed, and procedures which can be called. C and BASIC are procedural languages, with the callable procedures being known as ‘functions’ in C and ‘subroutines’ in BASIC. I think BASIC also has functions, but it’s been so long since I used it, I don’t really recall.

FUNCTIONAL programs are called that because they can be thought of as evaluating mathematical functions. Haskell is an example here. They are noted for their lack of side effects, which is a way of saying that they avoid modifying state the way procedural/imperative languages do.

OBJECT ORIENTED programs have, well, objects. These objects tend to have a number of characteristics, such as encapsulation and message passing. An object will tend to have a public interface, which is a way of passing commands to the object. The object will also have an implementation, which code outside the object doesn’t need to concern itself with. Objects can often pass messages to each other and act on those messages. C++ and Java are object oriented languages.
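
As a quick sketch of the imperative-versus-declarative flavor inside a single language ( C++ here, purely for illustration ): the first version below spells out every step and mutates a running total, while the second just states the result we want and leaves the looping to the library.

#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> xs = { 1, 2, 3, 4, 5 };

    // Imperative flavor: explicit marching orders, mutable state.
    int total = 0;
    for ( int x : xs ) {
        total += x;
    }

    // More declarative / functional flavor: describe the result, no hand-rolled loop.
    int total2 = std::accumulate( xs.begin(), xs.end(), 0 );

    std::cout << total << ' ' << total2 << '\n';   // prints "15 15"
    return 0;
}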

Hopefully this is clearer than the relevant Wiki pages.

Tuesday, January 22, 2013

Mental State Saving

I was thinking about why I could go back and easily get back into a saved Bejeweled game, but not other, more complicated games, such as Arkham City or Deus Ex. It’s certainly not a question of favoritism. If you asked me which game I liked more, the answer would be Deus Ex, by a considerable margin. But what is it, then? I find it more difficult to load up a save of Deus Ex than it is to pick up my phone and fire up Bejeweled. There are many possible answers, but I think one of the more interesting ones to explore is rooted in complexity and mental state.

If we were to ask which game is more complicated, certainly, games like Deus Ex win by a landslide. They have conversation arcs, location maps, branching decision trees and story paths, inventory systems, and so on. Even a very simple single player FPS such as Serious Sam will have things such as which weapons you have, which map you’re on, what stuff you have or have not picked up, and which secrets you have or have not found yet. Bejeweled, on the other hand, has very few, simple rules, and while they may change from game type to game type, overall the playing field is very homogeneous.

Similarly, the rules are different. For Bejeweled, the entire ruleset can be, and in fact must be, kept in the player’s head if they are to be successful. There are only a few ways to move gems on the field, and only a few ways to make successful combinations. Contrast this with an FPS, where the minimum information you need is where you are, where the enemy is, what weapons you have, and how much ammo is left.

I think the difference in picking up the game again, particularly long after the last time I’ve played it, comes down to the idea of mental state. Can the player hold the entire state of the game in their head, and do they even need to? For simpler games, they don’t need to. I can pick up my saved Bejeweled game at any point and immediately get into it. The state is revealed in full, immediately, when I load the playing field, and the rules are simple enough that even if I’ve temporarily forgotten them, some experimentation will quickly reveal them to me again. Contrast this with many other game types, where even the loading screen tips and reminders may not be enough. What objectives have I accomplished? Which side objectives do I need to pick up, or already have? What’s in my inventory? Who did I talk to last, and what’s my character’s relationship state?

Steps can be taken to mitigate the problem of restoring the player’s mental state, like the aforementioned loading tips and reminders, and of course many games implement something like a log book to help you keep track of what you’ve done in this save file. However, they tend to be necessarily incomplete; they do not contain all of the details, and they certainly cannot be expected to keep track of certain personal goals a player may have set for themselves ( like try to get a certain weapon early because you knew where it might be stored ).

However, in the end, the simple PopCap-style games can be said to be devoid of player state. The player needs to bring nothing to the game besides a desire to play, and no matter where they were last or what they were doing, they can quickly and easily get back into gameplay. For me, this low barrier often makes them strangely more alluring than trying to run Arkham City again and trying to remember everything I feel I need to know, as well as what button throws the freeze grenade, if I even have it. Of course, I still like the big, deeper games better. The depth is welcome, the storylines intriguing, and a good game can get me to explore myself a bit.

But it’s still interesting to think about, both as a player ( why do I have more hours logged in on my iPad than my PS3? ) and as a budding game designer. I’ll need to keep these things in mind as I make large and complicated games. Saving and loading the state of the game, that’s easy; getting a player back into a game after months of them not touching it and trying to restore their mental state sufficiently to a point that they’re not repeating tasks or getting frustrated, that’s hard. Difficult, but definitely worthwhile to try and overcome.

Tuesday, January 15, 2013

Recursion

Recursion is an idea that seems very hard to ‘get’. Thinking about it doesn’t come naturally to most, myself included. While I can’t help you get to an ‘ah-ha!’ moment, I do have a mental framework for how to very quickly build a certain class of recursive function.

First, what’s recursion? It’s anything that repeats itself in a self-similar way. For programming, this can be generalized to the idea of a function that calls itself to do its task. A simple C++ example for calculating a factorial is this:


int factorial ( int x ) {
    if ( x == 1 ) return 1;              // base case: 1! is 1
    return x * factorial( x - 1 );       // recursive case: x! = x * (x-1)!
}

You’ll notice in the return statement, the factorial function calls itself again; that is a really simple example of recursion. This calculation could also be done iteratively, that is to say, using a loop or goto instead. I could also recode it to use a specialized variant of recursion known as tail recursion, which can be more efficient, but for now, I’m going to stick to keeping it simple.

So, the first question that should be asked when making a recursive function is: is recursion really necessary? While recursion can make for some neatly compact code and looks very clever, there are trade-offs associated. One of them is, in fact, cleverness. Coding to be clever can often backfire when you or somebody else has to go back and try to understand what the hell it was you were trying to do. Another trade-off can be in performance. When a function calls itself, each call pushes a new stack frame; every new recursive call is putting more and more data on the stack. A simple iterative loop avoids this problem.

So, we’ve decided to plow on ahead anyway, and make a recursive function. My first step is to start with the final step, or the simplest step of the recursion. What is the very smallest version of the problem I am dealing with?

For the factorial problem, it’s the factorial of 1. You don’t even need to do any multiplication for that; just return 1. For making a list or a tree, it’s adding a node to the empty list or tree. For the Fibonacci sequence, I actually have two ‘simplest’ conditions, the first two numbers in the sequence, 1 and 1. So I start with that.

Now, I go to the next most complicated step, and think about what’s needed. For the factorial, I now need to do some multiplication. What’s the factorial of 2? That’s 2 * 1 = 2, or the more complicated case ( 2 ) multiplied by my ‘end’ simplest case of 1. You can see how this is accomplished in the code above. For making a list or tree, well, I need to traverse the list. So I check the next node in line, using whatever ordering ( or none at all ) I fancy. If the next node is null, or the next node is the one -just before- where I need to do an insertion, do the insertion; otherwise, call the function again, but this time with the address of the next node in line. For the Fibonacci sequence, it’s a little more complicated, but not much. Since each Fibonacci number is the sum of the two previous numbers in the sequence, I just need to make my return statement something like return fib( n - 2 ) + fib( n - 1 );

And that’s it. Make sure my code does what I want it to, and call it done. It’s really that simple. The hardest code is for the list, and that’s because you’ll need to make sure you’re linking the nodes correctly. Otherwise, the algorithm for a linked list is, largely, ‘if null, make a node; if not, call this function again with the address of the next node’. Sooner or later, one of the function calls will hit the null node, and then it’ll just make a node there.
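
Here is a minimal sketch of that linked-list recursion, with no particular ordering ( new values just go on the end ). The Node and insert names are made up for illustration.

#include <iostream>

struct Node {
    int   value;
    Node* next;
};

// If we've hit the null link, make the node there; otherwise, call this
// function again with the address of the next node in line.
Node* insert( Node* head, int value ) {
    if ( head == nullptr ) {
        return new Node{ value, nullptr };        // base case: empty spot, make the node here
    }
    head->next = insert( head->next, value );     // recursive case: hand the rest of the list off
    return head;
}

int main() {
    Node* list = nullptr;
    for ( int v : { 3, 1, 4, 1, 5 } ) list = insert( list, v );
    for ( Node* p = list; p != nullptr; p = p->next ) std::cout << p->value << ' ';
    std::cout << '\n';                            // prints "3 1 4 1 5"
    // ( Freeing the nodes is left out to keep the sketch short. )
    return 0;
}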

So there you have it. I hope this was helpful.

Tuesday, January 8, 2013

C++ Compiling

I’ve found that the college doesn’t really do a good job of describing what, exactly, is happening when you hit ‘enter’ after the “g++ filename.cpp” command has been typed in. This is a general post I hope to be able to point people at in the future, and will be a -very- broad overview of what is happening when a file gets compiled. I’m not going to talk about tokens or machine code or any of that; I’m going to cover, quickly, the most common preprocessor directives, and broadly what it is the linker does.

First, preprocessor directives. If it begins with a #, it’s a preprocessor directive. These directives are not C++ code; they’re instructions to the preprocessor, and as such, they are executed before actual proper compilation begins. The #include statement is essentially a copy and paste operation. The file named in the #include statement will have its entire contents copied, and then pasted into the file at the location of the #include. If you use #include "file", with double quotes, the preprocessor will start its search for the file in the current directory. If instead you use #include <file>, with angle brackets, the preprocessor will start looking at a location defined by your compiler, typically where your standard header files live.

Another useful trio of preprocessor directives is #ifndef, #define, and #endif. The first one can be read as ‘if not defined’. This should be at the top of every header file you make, and should be immediately followed by a #define statement. What this is telling the preprocessor is ‘if this hasn’t been defined yet, define it now’. Normally this definition will have a name similar to the name of the header file. So, for my header, nonsense.h, the full preprocessor directive should look like this:

#ifndef NONSENSE_H
#define NONSENSE_H

At the end of all the code in my header, I will put in an #endif. What this does is make sure the contents of the header don’t get processed more than once, even if the header gets pulled in several times. So if I have main.cpp, nonsense.cpp, and whatever.cpp all with a #include "nonsense.h", the contents of nonsense.h will still only be processed once per compiled file. This is good, because nonsense.h should, as a header file, also have all my declarations in it, and C++ will get cranky ( read: not compile ) if it finds multiple definitions of the same thing.

There is another common use for #define. It is often used to create constants, for example #define PI 3.14. It’s worth noting that this is not making an actual constant, like const double pi = 3.14; would. All it’s doing is forcing a text substitution. When the preprocessor hits this particular #define, it will go through your code, and everywhere it sees PI, it replaces it with 3.14 instead. Remember, the preprocessor does not know C++. Be careful when doing this not to treat PI ( or whatever your #define is ) like a variable.
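
A tiny illustration of the difference ( the names here are just for the example ):

#include <iostream>

#define PI 3.14                 // the preprocessor pastes the text "3.14" wherever it sees PI
const double pi = 3.14;         // an actual typed constant the compiler knows about

int main() {
    double circumference = 2.0 * PI * 5.0;   // after preprocessing this reads 2.0 * 3.14 * 5.0
    std::cout << circumference << '\n';      // prints 31.4
    // Writing PI = 3.15; would not compile, because after substitution it reads 3.14 = 3.15;
    // the const double pi, on the other hand, behaves like any other ( read-only ) variable.
    std::cout << 2.0 * pi * 5.0 << '\n';
    return 0;
}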

There are other uses for the preprocessor directives, but those are the most common ones. Now onto the linker. The linker is a promise keeper, of sorts. It checks the promises that you as a programmer have made, and makes sure that you’ve kept those promises in code. The promises you’ve -made- are your declarations. Keeping those promises happens in your definitions. A function declaration, for example, is this:

int factorial( int n );

That’s the promise. By making this declaration, you are promising the linker that later on, you will have a definition. Your definition might be something like this:

int factorial( int n ) {
    if ( n == 1 ) return 1;              // base case
    return n * factorial( n - 1 );       // recursive case
}

And that’s a promise kept. The linker also takes all the .o files generated during compilation and links them together into one executable file. Usually this will be static linking, where all the code actually exists in one executable file. However, you can also run across dynamic linking. Dynamic linking is complicated, and all I’ll say about it here is that when you’re using dynamic linking, the files will -not- all be compiled into a single executable file. Instead, there will be the executable file, and it will need some external code in order to run properly, usually in the form of DLLs ( dynamically linked libraries ) or SO ( shared object ) files.
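
To tie the preprocessor and linker pieces together, here is a minimal three-file sketch using the hypothetical nonsense.h from above. The declaration in the header is the promise, the definition in nonsense.cpp keeps it, and the linker stitches the resulting .o files into one executable.

// ---- nonsense.h ----
#ifndef NONSENSE_H
#define NONSENSE_H

int factorial( int n );                  // the promise ( declaration )

#endif // NONSENSE_H

// ---- nonsense.cpp ----
#include "nonsense.h"

int factorial( int n ) {                 // the promise kept ( definition )
    if ( n == 1 ) return 1;
    return n * factorial( n - 1 );
}

// ---- main.cpp ----
#include <iostream>
#include "nonsense.h"

int main() {
    std::cout << factorial( 5 ) << '\n'; // prints 120
    return 0;
}

// Built with something like: g++ -c nonsense.cpp main.cpp, then g++ nonsense.o main.o -o demo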

Tuesday, January 1, 2013

Blog Revamp

I've noticed that a fair number of the blogs that I read on a regular basis tend to be focused in their subject matter. They're a combination of personal blog and work blog. Personal enough to make connections with readers and really reveal a little bit about the person writing. About work enough to have a tightly scoped subject matter, and to really reveal some interesting things about subjects I'm interested in. Also, since the people who bother blogging are usually experts in their field, I can learn new things from them. When those people are in -my- field, I often learn interesting new things and new ways to think about problems I'm facing.

As a result, I'm revamping my blog to 'fit in'. What I've mentioned above is only one way to make a 'good blog', but it's the way I'm going to follow. So instead of the previous model I was using, which was to treat the blog really very much like a personal journal, I'm going to narrow this blog's focus. My posts from here on out will primarily be concerned with my work in computer science. I'll periodically make an entry regarding my nuclear field work, since I feel everyone should know more about atomic processes, and maybe a bit about neurology or life lessons I've learned. And if I can't think of something to put up a particular week, well, I still have the recipe fallback.

Anyway. This is my little corner of the internet. I really don't expect anyone to read it, ever, but if someone does, I hope it either spurs conversation or they otherwise find it useful.

Monday, July 23, 2012

Life on Mac Is (Still) Hard

Following up my previous post, which I should have followed up immediately but didn't.

Installing on Mac OS X Lion continues to have some fun problems. I've already mentioned the problem with Vorbis; FLAC, too, has problems with its configure script, which can be fixed with a command-line switch. To get the FLAC developer libraries to install correctly, the following must be done.

./configure --disable-asm-optimizations

Thanks go to Stack Overflow for that. Last but not least, SDL_Mixer's install also has problems. Even with Ogg Vorbis installed and functioning ( you can test that with some of the example C files it comes with ), SDL_Mixer's configure script won't recognize them. In this case, I couldn't find any command-line switches to force it to work. I instead went in and simply bypassed the configure script's test for checking whether Vorbis was installed and working. I'm trying to learn more about shell scripting so I can see how exactly the configure script does its test, and then I'll go back into all -three- configure scripts, update them, and see if I can get the changes back to the people who maintain this stuff.

For right now, if you find yourself having troubles, post a message, and I'll send you the modified configure script for SDL_Mixer. The reason why I'm not just flat out putting a link to it right now is because even though it's fixed, it's not fixed -correctly-, and I don't like that.

Sunday, July 8, 2012

Life On Mac is Hard

Posting this for posterity, for Mac developers. If you're not a Mac developer who treats it like a glorified Unix machine ( instead of using XCode for everything ), this might not make much sense. Move along. I'll post a recipe or something later.

This post applies to libogg 1.3.0, libvorbis 1.3.3, and was done on Mac OS X 10.7 running on a MacBook Pro. After you get done installing libogg 1.3.0, attempting to run libvorbis 1.3.3's configure script will result in the following error spew:


*** Could not run Ogg test program, checking why...
*** The test program failed to compile or link. See the file config.log for the
*** exact error that occured. This usually means Ogg was incorrectly installed
*** or that you have moved Ogg since it was installed.
configure: error: must have Ogg installed!

Which is, of course, complete nonsense, assuming you did install libogg 1.3.0 first, using proper permissions and everything. You can test that libogg 1.3.0 installed correctly using a C/C++ program if you like; I leave that as an exercise for the reader. The problem is that the configure script for libvorbis tries to build to i386 instead of x86_64. To correct this, you have to force the build script to build for x86_64 instead. Run ./configure --build=x86_64.

I freely admit I wasn't smart enough to figure out anything past 'libogg is trying to build to x86_64, while libvorbis is trying to build to i386, what's up with that?'. This forum post gave me the rest of the pieces necessary to make the whole thing work ( scroll to the bottom to find the relevant post ).

For my next trick, I might fix the configure scripts and see if I can't get the fix back to the Xiph guys, but... probably not. Still not smart enough. Working on that.

Wednesday, June 27, 2012

What Time Means On a Modern CPU

So, I've decided to go all in on game programming. This should surprise exactly nobody. I sat down and did some quick math to see how much headroom I had.

The engine I'm ultimately building will be a 2D vector game engine, because I like to make life hard for myself. The target frame rate is 120 FPS, for no other reason than that it's the highest refresh rate of any monitor or display on the market that I know of. I -think- you see it in the 3D sets; I don't know, I've essentially ignored the recent 3D revolution.

Anyway. 120 FPS means that every frame has 1/120, or about 0.00833, seconds to do its work. Assuming a CPU running at 1 GHz, that means I have eight and a third million clock cycles, per frame, to do all the work that needs to get done.
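
A quick sketch of that back-of-the-envelope math, with the 1 GHz clock as an assumed round number:

#include <iostream>

int main() {
    const double clock_hz          = 1.0e9;   // assumed 1 GHz CPU
    const double frames_per_second = 120.0;   // target frame rate

    const double seconds_per_frame = 1.0 / frames_per_second;        // ~0.00833 s
    const double cycles_per_frame  = clock_hz * seconds_per_frame;   // ~8.33 million

    std::cout << "seconds per frame: " << seconds_per_frame << '\n';
    std::cout << "cycles per frame:  " << cycles_per_frame  << '\n';
    return 0;
}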

I knew modern machines were fast, who doesn't, but I still managed to be surprised to actually see it quantified. I had some kind of vague idea in my head of how fast a CPU was, but now that I see it, man, that's amazing. We've come a long way from the 10 MHz 286 I first learned to bit-bash on.

Now, I'm well aware that a 'clock cycle' is not necessarily a terribly useful metric by itself. There are lots of questions that need to go with it. How many clocks does it take to do a fetch for a given data size? How many clocks does it take to do certain math operations that are going to come up frequently in your code? So on, so on.

Still. I guess it's not that astonishing, but it is interesting (to me) anyway.