Chapter 1

Ruby Basics

Before the discussion of Ruby's internals can begin, a short overview of the Ruby language is in order. Only a minimal subset of Ruby syntax is covered here; a more detailed discussion is deferred until Chapter 8. For most readers this will all look familiar, so feel free to skip it.

Objects

Character String

Everything that a Ruby program can manipulate is an object. For example, when you write "Content", a string object is created.

"Content"
"Content"
"Content"

Each time "Content" was written above, a new string object was created. The objects are not displayed; they have simply been created and now reside somewhere in the object space.
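
The claim that each literal creates a separate object can be checked with a minimal sketch like the one below. It assumes that frozen string literals are not enabled (no magic comment or interpreter option), since that setting lets Ruby reuse one object for identical literals.

```ruby
# Two identical literals evaluate to two distinct String objects.
a = "Content"
b = "Content"

same_text   = (a == b)       # compares the contents
same_object = a.equal?(b)    # compares the identity of the objects

p(same_text)                 # the contents are equal
p(same_object)               # false here, assuming frozen string literals are off
```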

To display a string (or any other object), it must be passed to a method that can display it.

  p("Content")            # "Content" is displayed on the console

Everything from "#" to the end of a line is a Ruby comment. In this book, such comments indicate the result of the statement preceding them.

Various literals

Literal Numeric Objects are created as follows:

# Small Integers
1
2
100

# Extremely Large Integers
9999999999999999999999999

# Floating Point Decimals
1.0
99.999
1.3E4 # 1.3×10^4

Array Literals can be created with the following expression:

  [ 1, 2, 3 ]

Because array elements can be objects of any type, the following is possible:

[ 1, "string", 2, [ "nested", "array" ] ]

Furthermore, the expression below creates a hash:

  {"Key" => "Value", "key2" => "Value2", "key3" => "Value3"}

A hash is a collection of key/value pairs. The keys must be unique. The hash above remembers the following correspondence:

  "Key"  -> "Value"
  "key2" -> "Value2"
  "key3" -> "Value3"
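
As a small illustration of how such a correspondence is used, the sketch below looks up a value by key and overwrites an existing key. The variable names are chosen only for this example.

```ruby
h = {"Key" => "Value", "key2" => "Value2", "key3" => "Value3"}

value   = h["key2"]       # look up a value by its key
missing = h["Key2"]       # keys are case-sensitive; an absent key gives nil

h["Key"] = "Replaced"     # assigning to an existing key overwrites its value
size = h.size             # the hash still holds three pairs
```

Because keys are unique, the assignment replaces the old value instead of adding a fourth pair.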

Method calls

A method is what C++ calls a 'member function'. A method call has the form:

  "Content".upcase()

This expression returns a new object containing "CONTENT". Method calls can be chained:

  "Content".upcase().downcase()

Parentheses are optional in Ruby, so the same expression can also be written (preferably) as:

  "Content".upcase.downcase

Programs Part 1

Top Level

A Ruby program can consist of a single expression. Unlike C/C++, there is no need for a main() function. A simple program such as:

  p("Content")

can be placed in a file or given on the command line. If it is in the file 'first.rb':

  % ruby first.rb
  "Content"

Programs can also be entered on the command line with the '-e' switch:

  % ruby -e 'p("Content")'
  "Content"

In fact, ruby also reads a program from standard input when no file is given, so even large programs can be executed by redirecting them into the interpreter:

  % ruby <first.rb
  "Content"

Additionally, if 'first.rb' has the line '#!/usr/bin/ruby' at the top and the file is made executable, the following also runs the program (assuming the interpreter is installed at that path):

  % ./first.rb
  "Content"

Variables

Variables and constants in Ruby hold references to objects. The first character of a name indicates its kind and scope. A name starting with a capital letter is a constant, which includes class names and module names.

Naming Conventions for Variables, Constants, Class Names and Module Names

  Local Variables  Global Variables  Instance Variables  Class Variables  Constants and Class Names
  name             $debug            @name               @@total          PI
  fishAndChips     $CUSTOMER         @point              @@symtab         FeetPerMile
  x_axis           $_                @X                  @@N              String
  thx1138          $plan9            @_                  @@x_pos          MyClass
  _26              $Global           @plan9              @@SINGLE         Jazz_Song

Since variables and constants are references to objects, copying one variable to another copies only the reference, not the object. Variables come into existence when they are first assigned; there is no need to declare them beforehand.

  Str = "content"            # Constants
  Arr = [ 1,2,3 ]

Furthermore, since variables have no type, they can hold a reference to an object of any type. The following is completely legitimate:

  lvar = "content"           #Variable
  lvar = [ 1,2,3 ]
  lvar = 1

Referring to a variable uses the obvious notation: simply write its name.

  str = "content"
  p(str)                     # "content" is displayed

Copying a variable to another variable simply duplicates the reference.

  a = "content"
  b = a
  c = b

After the statements above, all three variables refer to the same string object.

(Reference)

Figure 1: Ruby variables hold references to objects
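
One way to observe the shared reference is to modify the object through one variable and read it through another. The sketch below uses String.new so that the string is an ordinary mutable object; that choice is an assumption made for the example, not something the text above requires.

```ruby
a = String.new("content")   # an ordinary, mutable string object
b = a
c = b

c << "!"                    # modify the object through one reference

p(a)                        # the change is visible through every reference
shared = a.equal?(b) && b.equal?(c)
```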

Constants

A name that starts with a capital letter is treated by Ruby as a constant.

  Const = "content"
  PI = 3.1415926535

  p(Const)                   # "content" is displayed

If the program assigns a new value to a constant, a warning is normally produced, but the assignment still takes effect. In other words, a constant's value can be changed. However, this is very poor programming practice and should be avoided.

Freezing is a separate mechanism that applies to objects rather than names. Once an object has been frozen (Object#freeze), it cannot be modified for the rest of its life, whether it is referenced by a variable or by a constant. If the program attempts to modify a frozen object, an exception is raised.
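
The following sketch shows the effect of freezing on an object held in a local variable. The exact exception class differs between Ruby versions, so the example rescues StandardError instead of naming a specific class.

```ruby
s = String.new("content")
s.freeze

begin
  s << "!"                   # attempt to modify the frozen object
  error_raised = false
rescue StandardError         # the exact exception class depends on the Ruby version
  error_raised = true
end

frozen = s.frozen?
s = "another object"         # the variable itself can still be pointed elsewhere
```

Freezing protects the object, not the name: the last line succeeds because it only changes which object the variable refers to.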

Control structures

Ruby control structures check whether a conditional statement is true or false.

  if i < 10 then
    # body
  end

  while i < 10 do
    # body
  end

  Ruby has a simple definition of true and false. If a
  conditional expression evaluates to false or nil, it is
  false. Everything else is considered true. For those
  from a C/C++ background: every integer value is
  considered true, even zero.
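
The rule can be tried out with a small helper. The method name truthy? is hypothetical and exists only for this sketch.

```ruby
# Reports how a value behaves when used as a condition.
def truthy?(value)
  value ? true : false
end

results = {
  "false"        => truthy?(false),
  "nil"          => truthy?(nil),
  "0"            => truthy?(0),
  "empty string" => truthy?(""),
  "empty array"  => truthy?([])
}
p(results)
```

Only the first two entries come out false; zero, the empty string and the empty array all count as true.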

Classes and Methods

Classes

In Ruby, as in other object-oriented languages, the concept of a class ties objects and methods together. Every object belongs to a class. The root class of all Ruby objects is the class Object.

Because an object belongs to a particular class, all the methods defined in that class or inherited by it are available to the object.

For example, upcase and length are methods available to every string.

       "Content".upcase()
"This is a pen.".upcase()
    "Chapter II".upcase()

       "Content".length()
"This is a pen.".length()
    "Chapter II".length()

When a method is called on an object, Ruby looks for it in the object's class and then searches up the inheritance tree. If no method can be found, a runtime exception is raised.

  % ruby -e ' "str".bad_method'
  NoMethodError: undefined method `bad_method' for "str":String

Class Definition

A Class is defined by the keyword class followed by an identifier for the class. The Class definition is terminated by the keyword end.

  class C
  end

Once a class has been defined, instances of it are created with the method new (Class#new). The following code creates an object of class 'C' and a variable 'c' that references it.

  class C
  end
  c = C.new

Method definition

Now we will add a method to our class 'C'.

  class C
    def myupcase(str)
      str.upcase
    end
  end

Using the keyword def, we define the method C#myupcase. The method has one statement, and the method returns the value of that statement, the result of String#upcase. Now let us use our new class and method:

  c = C.new
  result = c.myupcase("content")
  p(result)                            # "CONTENT" is displayed

Remember that this can also be written as:

  p(C.new.myupcase("content"))         # "CONTENT" is displayed

Self

While a method is executing, the object on which it was called (the receiver) is available as self. The following verifies this:

  class C
    def get_self()
      self
    end
  end

  c = C.new
  p(c)                  # #<C:0x40274e44>
  p(c.get_self)         # #<C:0x40274e44>

The following verifies that self is the default receiver used when a method call does not specify one. The result is the same as when self is specified explicitly.

  class C
    def self_p(obj)
      self.real_my_p(obj)         # 'self' is explicit
    end

    def my_p(obj)
      real_my_p(obj)              # 'self' is implicit
    end

    def real_my_p(obj)
      p(obj)
    end
  end

  C.new.my_p(1)                  # '1' is displayed
  C.new.self_p(1)                # '1' is displayed

Instance Variables

Instance variable names start with a single '@'. An instance variable belongs to the individual object in which it is set.

  class C
    def set_i(value)
      @i = value
    end

    def get_i
      @i
    end
  end

  c = C.new
  c.set_i("ok")
  p(c.get_i)                   # "ok" is displayed

An instance variable has the value nil until a value is assigned to it:

  d = C.new
  p(d.get_i)                   # nil is displayed

Like other literals, nil can be used directly as a value:

  p(nil)                       # nil is displayed

Initialize

As we have seen, an instance variable that has not been assigned returns nil. For this and other reasons, it is often desirable to initialize an object's state when it is created. When an object is created with new, Ruby calls the instance method initialize on the new object, if the class defines one.

  class C
    def initialize
      @i = "ok"
    end

    def get_i
      @i
    end
  end

  c = C.new
  p(c.get_i)                  # "ok" is displayed
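
In practice, initialize often takes arguments. The sketch below assumes a hypothetical class Point; the class and its method names are not part of the original example.

```ruby
# Point and its methods are hypothetical names used only for illustration.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  def coordinates
    [@x, @y]
  end
end

pt = Point.new(3, 4)        # the arguments to new are passed on to initialize
p(pt.coordinates)
```

Calling Point.new without arguments would raise an ArgumentError, because new forwards whatever it receives to initialize.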

Inheritance

Classes can inherit properties and methods from other classes. For example, String is a subclass of Object.

(Supersub) Figure 2: Inheritance

In the figure above, String is a subordinate class (subclass) of Object. Conversely, Object is said to be the superclass of String. Note that this terminology varies between languages; C++, for instance, speaks of base classes and derived classes.

To create a sub-class of any other class, the Superclass must be specified in the subordinate class declaration:

  class C < SuperClassName
  end

The new class inherits all the methods of its superclass. The superclass in turn inherits all the methods of its own superclass. This continues until the inheritance chain reaches the class Object, the root of all classes. A short example of how classes inherit methods is shown below:

  class C
    def hello
      "hello"
    end
  end

  class Sub < C
  end

  sub = Sub.new
  p(sub.hello)                # "hello" is displayed

As we have seen before, the variable 'sub' is not necessary, because method calls can be chained.

  p(Sub.new.hello)

In addition to inheriting methods, a subclass can override an inherited method. When several methods of the same name exist in the inheritance tree, the definition closest to the object's class is used.

  class C
    def hello
      "Hello"
    end
  end

  class Sub < C
    def hello
      "Hello from Sub"
    end
  end

  p(Sub.new.hello)         # "Hello from Sub" is displayed
  p(C.new.hello)           # "Hello" is displayed

Whenever a method is called, Ruby begins searching in the class of the receiving object and, if the method is not found there, continues up the inheritance chain. If no matching method has been found by the time the root class Object has been searched, a 'NoMethodError' exception is raised.
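
The search order can be inspected from within Ruby itself. The sketch below reuses the classes C and Sub from above and relies on the standard methods Module#ancestors and Method#owner.

```ruby
class C
  def hello
    "Hello"
  end
end

class Sub < C
  def hello
    "Hello from Sub"
  end
end

chain = Sub.ancestors                   # classes and modules in search order
p(chain.first(2))                       # the subclass comes before its superclass
owner = Sub.new.method(:hello).owner    # the class whose definition was found
```

The rest of the chain beyond C depends on the Ruby version, but Object always appears after C.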

(Multiinherit)

Figure 3: Multi-level inheritance

  Ruby is a truly object-oriented language. The top
  of the inheritance tree is the class Object. All
  other classes are directly or indirectly subclasses
  of Object.

(Classtree) Figure 4: The class tree of Ruby

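Variables and Inheritance

Instance variables are not inherited in the way methods are, because they belong to individual objects rather than to classes. To verify this, see the result of 'print_b' below. However, an instance variable set by an inherited method can be accessed, since it is stored in the same object. See 'print_i' below: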

  class A
    def initialize
      @i = "ok"
    end
  end

  class B < A
    def print_i
      p(@i)
    end
    @b = @i

    def print_b
      p(@b)
    end
  end

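  B.new.print_i                    # "ok" is displayed
  B.new.print_b                    # nil is displayed

  The method 'print_i' can access '@i' because the
  inherited 'initialize' runs with the new B object as
  self and stores '@i' in that very object. The line
  '@b = @i', by contrast, runs once in the class body,
  where self is the class B itself, so it never sets
  '@b' in any instance of B.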
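
Where each variable actually ends up can be checked with the reflection methods instance_variable_defined? and instance_variable_get. This is a minimal sketch that keeps only the parts of classes A and B needed for the check.

```ruby
class A
  def initialize
    @i = "ok"
  end
end

class B < A
  @b = @i          # runs in the class body, where self is the class B
end

obj = B.new
in_instance = obj.instance_variable_defined?(:@b)   # is @b set on the instance?
in_class    = B.instance_variable_defined?(:@b)     # is @b set on the class object?
from_init   = obj.instance_variable_get(:@i)        # set by the inherited initialize
```

Here '@b' exists only on the class object B, while '@i' exists on the instance.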

Modules

A class can have only one superclass; Ruby follows a single-inheritance model. However, modules make many of the features of multiple inheritance available. Modules provide two major benefits:

  1. Modules provide a namespace and prevent name clashes.
  2. Modules implement the mixin facility.

Modules are simply containers of methods, classes, and constants. They cannot have superclasses and cannot create instances of themselves. A module is defined as follows:

  module M
  end

With module 'M' defined, methods can be added exactly the same way as in classes.

  module M
    def myupcase(str)
      str.upcase
    end
  end

Modules cannot create instances of themselves, but classes can 'include' them in their definitions. This is referred to as a mixin. When a module is included in a class, all the methods of the module become available to instances of the class.

  module M
    def myupcase(str)
      str.upcase
    end
  end

  class C
    include M
  end

  p(C.new.myupcase("content"))     # "CONTENT" is displayed

As we said, modules cannot have a superclass, but they can include other modules. This approximates having a superclass.

  module M
  end

  module M2
    include M
  end

The following example shows modules including other modules to obtain something similar to inheritance.

  module OneMore
    def method_onemore
      p("Onemore")
    end
  end

  module M
    include OneMore

    def method_m
      p("M")
    end
  end

  class C
    include M
  end

  C.new.method_m             # "M" is displayed
  C.new.method_onemore       # "Onemore" is displayed

If this inclusion is drawn in the same way as class inheritance, it looks like Figure 5.

(Modinherit)

Figure 5: Multi-stage inclusion

Now let us examine what happens when a class inherits a method from its superclass and also includes a module that defines a method of the same name. Which method will be used?

  class Cls
    def test
      "class"
    end
  end

  module Mod
    def test
      "module"
    end
  end

  class C < Cls
    include Mod
  end

  p(C.new.test)            # "class" or "module"?

Executing this program shows that the answer is "module". The module takes precedence over the superclass. Why?

When modules are included, they are inserted into the inheritance chain between the class and the Superclass.

(Modclass)

Figure 6: Correlation of class and module

When several modules are included, all of them take precedence over the superclass.

(Modclass2)

Figure 7: Correlation of class and module (2)
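
The position of included modules in the search path can be seen with Module#ancestors. The sketch below repeats Cls, Mod and C from above and adds a second, empty module Mod2, which is a hypothetical name introduced only to show the case of multiple inclusion.

```ruby
class Cls
  def test
    "class"
  end
end

module Mod
  def test
    "module"
  end
end

module Mod2          # hypothetical second module, empty on purpose
end

class C < Cls
  include Mod
  include Mod2
end

chain = C.ancestors.first(4)    # the start of the method search path
p(chain)
answer = C.new.test
```

The module included last sits closest to the class, so the path begins C, Mod2, Mod, Cls, and the call finds the definition in Mod before reaching Cls.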

Programs Part 2

Nesting of Constants

First a review. A variable with the first letter capitalized is a Constant.

  Const = 3

A Constant is referred to as follows:

  p(Const)                       # 3 is displayed

In fact, it can also be written as follows:

  p(::Const)                     # 3 is displayed

When a constant is prefixed with ::, it refers to the constant defined at the top level of the program.

The operator "::" is used for scope resolution. It has the form:

  {class | module | }::{Constant | Class | Module}

  ::Value          - References the constant Value at the top level
  Foo::Value       - References the constant Value in class/module Foo
  Foo::Foo2::Value - References the constant Value in the nested class/module Foo::Foo2

  class SomeClass
    Const = 3
  end

  p(::SomeClass::Const)         # 3 is displayed
  p(SomeClass::Const)           # 3 is displayed

Since the class SomeClass is defined at the top level, both references point to the same constant.

  class C                 # ::C
    class C2              # ::C::C2
      class C3            # ::C::C2::C3
      end
    end
  end

The example above contains nested class definitions. Referring to the inner classes (C2 and C3) from outside requires the scope resolution operator.
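
A short sketch of how the nested classes are reached from the top level follows. It repeats the nesting from above and shows that a bare C3 is not visible outside C::C2.

```ruby
class C
  class C2
    class C3
    end
  end
end

inner = C::C2::C3                    # full path from the top level
p(inner.name)
same = (::C::C2::C3).equal?(inner)   # the leading :: names the same class

begin
  C3                                 # a bare C3 is not visible at the top level
  bare_visible = true
rescue NameError
  bare_visible = false
end
```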

Everything is executed

 1: p("first")
 2:
 3: class C < Object
 4:   Const = "in C"
 5:
 6:   p(Const)
 7:
 8:   def myupcase(str)
 9:     str.upcase
10:   end
11: end
12:
13: p(C.new.myupcase("content"))

This program is executed in the following order.

   1: p("first")        - The string "first" is created and displayed.

   3: < Object          - The constant Object is looked up to obtain the superclass.

   3: class C           - A new class object is created with Object as its superclass and assigned to the constant C.

   4: Const = "in C"    - Defines the constant Const, which references the string "in C".

   6: p(Const)          - Displays the string "in C" referenced by Const.

   8: def myupcase      - Defines the method 'myupcase'. Its body is not
        (str)...end       executed at this point.

  13: C.new.            - Looks up the constant C, creates a new instance of
      myupcase("content") class C, and calls 'myupcase' on it with "content".

   9: str.upcase        - Calls 'upcase' on the receiver str, which is the
                          argument "content", and returns "CONTENT".

  13: p(...)            - The returned string "CONTENT" is displayed.
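
The difference between a class body, which runs during definition, and a method body, which runs only when called, can be recorded explicitly. In this sketch the class D, the method work and the global $events are hypothetical names; a global is used because local variables are not shared between these scopes.

```ruby
$events = []                       # a global, so every scope below can reach it

class D                            # D is a hypothetical class for this sketch
  $events << "class body"          # runs while the class is being defined

  def work
    $events << "method body"       # runs only when work is called
  end
end

after_definition = $events.dup
D.new.work
after_call = $events.dup
```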

Scope of local variable

Next, let us look at the scope of local variables.

The top level, the body of a class definition, the body of a module definition, and the body of a method each have their own independent local variable scope.

In the following example, the name 'lvar' is used in four independent scopes.

  lvar = 'toplevel'               # (Scope 1)

  class C
    lvar = 'in C'                 # (Scope 2)
    def method
      lvar = 'in C#method'        # (Scope 3)
    end
  end

  p(lvar)                         # (Scope 1) "toplevel" is displayed

  module M
    lvar = 'in M'                 # (Scope 4)
  end

  p(lvar)                         # (Scope 1) "toplevel" is displayed
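
Independent scopes also mean that a method body cannot read a local variable of the surrounding code. The sketch below assumes it is run as a top-level script; read_lvar is a hypothetical method name.

```ruby
lvar = "toplevel"

def read_lvar
  lvar                 # not the top-level lvar: def starts a fresh scope
rescue NameError
  "not visible"
end

result = read_lvar
```

Inside the method, the name lvar is neither a local variable nor a method, so Ruby raises NameError, which the sketch turns into a string.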

Context of Self

We have said that while a method is executing, the default receiver is self. But what is self at the top level?

  puts "Hello, World"

Not an object in sight. But dig deeper, and you'll come across objects and classes lurking in even the simplest code.

We know that the literal "Hello, World" generates a Ruby String, so there's one object. We also know that the bare method call to puts is effectively the same as self.puts. But what is "self"?

  puts self.kind_of?(Object)        # true is displayed

At the top level, we're executing code in the context of a predefined object. When we define methods at the top level, we're actually creating private instance methods of the class Object. Instance variables belong to the top-level object. And because that object is an instance of Object, we can use all of Object's methods (including those mixed in from Kernel) in function form. This explains why we can call Kernel methods such as puts at the top level (and indeed throughout Ruby). These methods are part of every object.

(Kernel) Figure 8: The Root Object "Object" is inherited by all objects!

Ruby has two ways to load programs and libraries: load and require. The load method reads and executes the Ruby source file every time it is called. Because load always loads the specified file, it can be used to reload a file that may have been changed during execution.

  load("program_name")

The require method loads a file only once; later calls for the same file do nothing. Like load, require is an ordinary executable statement, so it can, for example, be executed inside an if statement. It loads both Ruby source files and compiled C extension libraries (for example '.so' files).

  require("library_name")
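
The difference between the two can be observed by counting how often a file is executed. The sketch below writes a throwaway file named counter_lib.rb into a temporary directory; the file name and the globals $load_count and $require_results are assumptions made for this example.

```ruby
require "tmpdir"

$load_count = 0

Dir.mktmpdir do |dir|
  # counter_lib.rb is a throwaway file created just for this sketch.
  path = File.join(dir, "counter_lib.rb")
  File.write(path, "$load_count += 1\n")

  load(path)                 # executes the file
  load(path)                 # executes it again
  first  = require(path)     # executes it once more and remembers it
  second = require(path)     # already loaded: nothing is executed
  $require_results = [first, second]
end
```

The file runs three times in total: twice through load and once through the first require. The return value of require tells whether the file was actually loaded by that call.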

Classes Part 2

Constants and Nested Classes

The following is an example of nested classes. A constant is searched for first in the enclosing classes, from the innermost outward. If it is not found there, the superclasses are searched.

  class A1
  end
  class A2 < A1
  end
  class A3 < A2
    class B1
    end
    class B2 < B1
    end
    class B3 < B2
      class C1
      end
      class C2 < C1
      end
      class C3 < C2
        p(Const)
      end
    end
  end

Figure 9 shows the search order for the code above.

(Constref)

Figure 9: Search order for the constant
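
The priority of the enclosing class over the superclass can be demonstrated with a smaller arrangement. The names Outer, Parent, Child and FOUND below are hypothetical and chosen only for this sketch.

```ruby
class Outer
  Const = "from Outer"          # found through the enclosing class

  class Parent
    Const = "from Parent"       # would be found through the superclass
  end

  class Child < Parent
    FOUND = Const               # which Const does this refer to?
  end
end

p(Outer::Child::FOUND)
```

Both definitions are reachable from inside Child, and the one in the enclosing class Outer is found first.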

Metaclass

A metaclass holds methods that are defined for one particular object. Seen from that object, these methods are called singleton methods. They intercept calls before the search continues up the chain of inheritance.

The whole per-object behavior model of Ruby is based on the idea that every object has both a class of origin and a class of its own, where the definitions exclusive to that object reside. This isn't abnormal, nor an aberration; it's the way the whole thing works.

When you add a method to a particular object, Ruby stores it in that object's metaclass (also called a singleton class or virtual class). The metaclass is inserted into the object's inheritance path, in front of its class.

The "class << m" syntax opens the metaclass of the object m:

  require 'yaml'
  # MailTruck is assumed to be defined elsewhere.
  m = MailTruck.new("Harold", ['12 Corrigan Way', '23 Antler Ave'])
  class << m
    def to_yaml_properties
      ['@driver',  '@route']
    end
  end

  # Alternative metaclass definition:

  def m.to_yaml_properties
    ['@driver',  '@route']
  end

Meta object

Ruby objects originate from a collection of classes and metaclasses that Ruby creates before the user's program is read. These classes are interwoven so that the class and superclass chains never "run off the end". All of the built-in classes and objects are built on this foundation.

A simplified example of these core objects and how user classes are attached is shown in Figure 2 in "Introduction to the Ruby Language".

  p(Class.superclass)         # Module
  p(Module.superclass)        # Object
  p(Object.superclass)        # nil (BasicObject in Ruby 1.9 and later)

(Metaobjects) Figure 10: Origin of Ruby Objects

Singleton Methods

Singleton methods are methods that are attached to a specific instance of an object. They are stored in a metaclass attached to the object.

  obj = Object.new
  def obj.my_first
    p("Singleton Method")
  end
  obj.my_first                 # Displays "Singleton Method"
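
That a singleton method really belongs to one object only can be checked as follows. This is a minimal sketch; the second object, other, is introduced just for comparison.

```ruby
obj   = Object.new
other = Object.new

def obj.my_first
  "Singleton Method"
end

names    = obj.singleton_methods             # methods held by obj's own metaclass
only_obj = !other.respond_to?(:my_first)     # other objects are unaffected
in_class = Object.method_defined?(:my_first) # the class itself did not change
```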

Class Variable

A Class Variable is somewhat like a Constant that can be changed. A Class Variable is accessible and changeable by all instances of the class.

  class C
    @@cvar = "ok"

    def print_cvar
      p(@@cvar)
    end

    def change_cvar
      @@cvar = "not-ok"
    end
  end

  c = C.new
  c.print_cvar                      # "ok" is displayed
  c.change_cvar
  c.print_cvar                      # "not-ok" is displayed

A class variable must be initialized before its first use, or an "uninitialized class variable" exception is raised.

  % ruby -e '
    class C
      @@cvar
    end
  '
  -e:3: uninitialized class variable @@cvar in C (NameError)

Class variables are also shared with subclasses:

  class A
    @@cvar = "ok"                 # "ok" is loaded
  end

  class B < A
    def print_cvar
      p(@@cvar)
    end
  end

  B.new.print_cvar                # "ok" is displayed
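
Sharing works in both directions: a subclass can also change the variable, and the superclass sees the change. The sketch below adds a class-level reader A.cvar and a method change_cvar, both hypothetical helpers for this example.

```ruby
class A
  @@cvar = "ok"

  def self.cvar          # hypothetical reader, defined on the class itself
    @@cvar
  end
end

class B < A
  def change_cvar
    @@cvar = "changed in B"     # writes to the variable shared with A
  end
end

before = A.cvar
B.new.change_cvar
after = A.cvar
```

There is a single @@cvar for A and all of its subclasses, so the assignment made in B is visible through A.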

Global Variable

Global variables are prefixed with a "$" character. They are accessible and writable from anywhere in a Ruby program.

  $gvar = "global variable"
  p($gvar)                  # "global variable" is displayed

Unlike class variables, global variables that have not been initialized simply evaluate to nil and do not raise an exception.