Styling in wxRuby’s Scintilla

In the prior tutorial I hinted that I was working on compiling wxRuby with my own Pico Lisp lexer. I gave up on that after spending a few hours editing header files and makefiles, only to discover that the Scintilla component would not work, possibly because of my changes, but probably because of something else.


That plan had another flaw as well: with every new version of wxRuby I would have to compile it myself, since the lexer would not be precompiled. Do you know how long it takes to compile first wxWidgets and then wxRuby? The better part of an hour! So would a highlighter written in Ruby be such a drag? As it turned out, no. I’ve now tested with fairly large files and I don’t experience any lag, even though I use a very expensive way of doing the highlighting.

More has happened apart from the custom Pico highlighter: I’ve added lexers for CSS, HTML, PHP and JavaScript too. I experienced very buggy behavior with the HTML lexer, which is supposed to handle HTML, PHP and JS all at once. The HTML was not highlighted properly (attributes and values would not get their own colors, for instance), and JavaScript got no highlighting at all. Only PHP worked properly; in fact, I still use the HTML lexer for PHP. For JavaScript I switched to the ecscript lexer, which works OK even though it’s kind of buggy. I decided to roll my own HTML highlighter instead, but I will cover only the Pico lexer in this article; the principle is the same in both cases.

Important note: Scintilla uses 5 bits for styling by default, but because the HTML lexer needs so many different styles for the various kinds of inline scripts, it requires 7 bits, which you have to set manually with set_style_bits(7). If you don’t, it won’t work at all: you’ll see indicators all over the place (wavy underlines and so on) and no proper styling. This applies even if you use the lexer only for PHP, like I do.
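To make the 5 versus 7 bits a bit more concrete, here is a small sketch in plain Ruby of the arithmetic involved. The helper names style_mask and style_count are made up for this illustration and are not part of wxRuby; the only assumption is that the style bits are the low bits of each style byte, which is why 31 shows up as the mask for 5 bits.

```ruby
# Hypothetical helpers (not wxRuby API): plain bit arithmetic only.

# Mask that covers the lowest `bits` bits of a style byte.
def style_mask(bits)
  (1 << bits) - 1
end

# How many distinct style numbers fit into `bits` bits.
def style_count(bits)
  1 << bits
end

default_mask = style_mask(5)   # mask for the default 5 style bits
html_mask    = style_mask(7)   # mask after set_style_bits(7)

puts "5 bits: mask #{default_mask}, #{style_count(5)} styles"
puts "7 bits: mask #{html_mask}, #{style_count(7)} styles"
```

So with 7 style bits the mask would presumably be 127 rather than 31, and only one bit of the byte is left over for indicators.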

As you may know, a semicolon starts a comment in Common Lisp, but in Pico a # does the same. In Pico the semicolon is instead a shortcut for get.

Let’s start with what’s new in Scintilla.rb, both highlighting related stuff and other stuff discussed in the prior articles:

class Scintilla < Wx::StyledTextCtrl
  attr_accessor :file_path, :saved, :proj_id
  def initialize(frame, lexer, file_path, project, pr_id)
    super(frame)
    
. . .

    @ws_visible   = false
    @eol_visible  = false
    @file_path    = file_path
    @aui          = frame
    @saved        = true
    @proj_id      = pr_id
    
. . .

    evt_stc_charadded self, :onCharAdded
    evt_stc_marginclick self, :onMarginClick
    evt_stc_updateui self, :onUpdateUI
    evt_stc_modified self, :onModified
    evt_stc_styleneeded self, :onStyleNeeded
    
  end
  
. . .
  
  def replace(f_str, r_str)
    no_instances = 0
    sel_text = self.get_selected_text
    if sel_text.length > 5
      if (no_instances = sel_text.scan(f_str).length) > 0
        self.replace_selection(sel_text.gsub(f_str, r_str))
      end
    else
      if (no_instances = self.get_text.scan(f_str).length) > 0
        self.set_text(self.get_text.gsub(f_str, r_str))
      end
    end
    return no_instances
  end
  
  def onStyleNeeded(evt)
    self.start_styling(0, 31)
    @cur_lexer.startStyling(self)
  end
  
  def onSaveFile
    self.save_file(@file_path)
    @aui.toggleSaved(@saved = true)
  end
  
  def reload
    self.load_file(@file_path)
    @aui.toggleSaved(@saved = true)
  end
  
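  def refreshText
    # File.read opens, reads and closes the file in one call, so the handle
    # is released even when the text on disk is unchanged.
    self.reload if self.get_text != File.read(@file_path)
  end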
  
 . . .
  
  def getLexer
    @cur_lexer.name
  end
  
  def getCurWord
    @cur_lexer.getCurWord(self)
  end
  
. . .  

  def onModified(evt)
    chr = self.get_char_at(self.get_current_pos)
    @cur_lexer.onModified(self, chr)
    @aui.toggleSaved(@saved = false) if (@saved && self.get_modify())
  end
  
. . .
  
  def inLines(str)
    lines = {}
    (0..self.get_line_count - 1).each do |line_nbr|
      if self.get_line(line_nbr).include?(str)
        lines[line_nbr] = (line_nbr + 1).to_s + ": " + self.get_line(line_nbr).strip
      end
    end
    return lines
  end
end

As you can see, I took the time to implement replace in the selection or the current document. We also need to keep track of our parents: the aui and the treectrl, the latter in the form of @project, which is a reference to the complete component. We also need the id of the project (in the treectrl) that we belong to. The main thing is our mapping of onStyleNeeded to evt_stc_styleneeded, which will be called whenever Scintilla feels like we need to re-highlight something.

Note that we currently don’t do anything with the event that triggers the re-styling. Here we could examine exactly what has happened and then update the styling for only part of the document to make it faster. For now, though, we simply style the whole thing every time, hence the arguments to start_styling: 0 to start at the beginning and 31 to work only with the style bits, not the indicator bits. The indicators can be used to draw wavy underlines, for instance, in order to highlight bad code. We don’t work with them yet, only with the coloring of the text itself.
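For the curious, the "only a part of the document" idea could look something like the sketch below. It is plain Ruby with no wx calls, and every name in it (line_starts, line_of, lines_to_restyle) is hypothetical. I am assuming that the control can tell us where valid styling ends (Scintilla calls this the end-styled position) and that the style-needed event carries the position that styling is needed up to; how wxRuby exposes those two values is not shown here.

```ruby
# Hypothetical sketch: work out which lines need restyling.
# Positions are byte offsets, as in Scintilla.

# Byte offset at which each line of `text` starts.
def line_starts(text)
  starts = [0]
  text.bytes.each_with_index { |b, i| starts << i + 1 if b == 10 } # 10 == "\n"
  starts
end

# Index of the line that contains byte position `pos`.
def line_of(starts, pos)
  starts.rindex { |s| s <= pos } || 0
end

# Range of line numbers between the last validly styled position and
# the position the event asks us to style up to.
def lines_to_restyle(text, end_styled, needed_up_to)
  starts = line_starts(text)
  first  = line_of(starts, end_styled)
  last   = line_of(starts, [needed_up_to, text.bytesize].min)
  (first..last)
end

text = "(de foo)\n# c\n(bar)\n"
p lines_to_restyle(text, 10, 15)
```

Starting from the beginning of the first affected line keeps the line-by-line approach intact, since a line is always restyled as a whole.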

In pico.rb:

class Pico < Lexer
  attr_accessor :name
  def initialize(project, proj_id, sci)
    super(:pico, project, proj_id)
    @lex_num          = STC_LEX_CONTAINER

. . .
  
  def startStyling(sci)
    # Exclusive range: valid line numbers run from 0 to get_line_count - 1.
    (0...sci.get_line_count).each do |line_nbr|
      sci.start_styling(sci.position_from_line(line_nbr), 31)
      txt = sci.get_line(line_nbr)
      self.styleMe(txt, sci)
    end
  end
  
  def styleMe(txt, sci, flag = :def)
    if txt && txt.length > 0
      if (m = /(.*)(#.*)/.match(txt)) && (flag != :nocmt)
        a, b = m.captures
        if a.length > 0 && a.scan('"').length % 2 != 0
          self.styleMe(txt, sci, :nocmt)
        else
          self.styleMe(a, sci)
          sci.set_styling(b.length, STC_LISP_COMMENT)
        end
      elsif m = /(.*)("[^"]*")(.*)/.match(txt)
        self.styleLine(m, sci, STC_LISP_STRING)
      elsif m = /(.*)('[^()\s]+)(.*)/.match(txt)
        self.styleLine(m, sci, STC_LISP_SYMBOL)
      elsif m = /(.*)([()\s])(.*)/.match(txt)
        if m.captures[1] != ' '
          self.styleLine(m, sci, STC_LISP_OPERATOR)
        else
          self.styleLine(m, sci, STC_LISP_DEFAULT)
        end
      else
        if @project.getConfig(@proj_id, :pico).fetch(:all_keywords).include?(txt.strip)
          sci.set_styling(txt.length, STC_LISP_KEYWORD)
        elsif txt.strip =~ /^\d+$/
          sci.set_styling(txt.length, STC_LISP_NUMBER)
        elsif "'*/-+();~`".include?(txt.strip)
          sci.set_styling(txt.length, STC_LISP_OPERATOR)
        else
          sci.set_styling(txt.length, STC_LISP_DEFAULT)
        end
      end
    end
  end
  
  def styleLine(m, sci, style)
    a, b, c = m.captures
    self.styleMe(a, sci)
    sci.set_styling(b.length, style)
    self.styleMe(c, sci)
  end

. . .

end

Note the use of STC_LEX_CONTAINER as the lexer; if we don’t set it, we won’t be able to use our style-needed event properly. Observe startStyling: we need to highlight line by line. This would be a minor problem if we needed to account for multi-line comments; luckily Pico doesn’t have that 🙂

The basic approach is to simply work through the text recursively; set_styling keeps track of the last styled position and continues from there. This means we have to make sure we call it in the proper sequence. That’s basically it; it doesn’t take more than that to style a language on your own!
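To see why the call order matters, here is a tiny stand-in for the editor control that just records what it is asked to style. FakeStyler and style_line are made up for this illustration, the style names are plain symbols instead of the STC_LISP_* constants, and the line styler is deliberately simplified: it only splits code from a trailing # comment and ignores strings.

```ruby
# Hypothetical stand-in for the control: records spans instead of coloring.
class FakeStyler
  attr_reader :spans

  def initialize
    @pos   = 0
    @spans = []
  end

  # Move the styling position, like start_styling on the real control.
  def start_styling(pos, _mask)
    @pos = pos
  end

  # Style `length` bytes from the current position, then advance past them.
  def set_styling(length, style)
    @spans << [@pos, length, style]
    @pos += length
  end
end

# Simplified line styler: code first, then the comment, strictly in order.
def style_line(line, sci)
  code, hash, rest = line.partition('#')
  comment = hash + rest
  sci.set_styling(code.length, :default) unless code.empty?
  sci.set_styling(comment.length, :comment) unless comment.empty?
end

sci = FakeStyler.new
sci.start_styling(0, 31)
style_line("(foo) # bar", sci)
p sci.spans
```

Each recorded span starts where the previous one ended, so styling the comment before the code would put the comment color on the wrong characters.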
