Open
104 commits
cbbdd62
added name to README
Apr 13, 2017
74aaa55
finished warmup 1
Apr 13, 2017
f96ac37
renamed file
Apr 13, 2017
c7b1d4e
initial commit for Warmup 2
Apr 14, 2017
776d94b
added Struct, changed #initialize, renamed method
Apr 14, 2017
1af2dbf
starting over for Warmup 2
Apr 15, 2017
7a09c0e
implemented #tokenize and #process_token methods
Apr 16, 2017
2cb0dc6
refactored #process_token
Apr 16, 2017
8919e3e
added helper methods to check if a token is an open tag/close tag/text
Apr 16, 2017
526fdf5
implemented #build_tree method with Node struct
Apr 16, 2017
d337e90
fixed typo in #build_tree method
Apr 17, 2017
fcb4978
implemented #print_tree method
Apr 17, 2017
82696bb
finished Warmup 2
Apr 17, 2017
1ccd7dd
moved files
Apr 17, 2017
317c122
completed Warmup 3
Apr 17, 2017
ed28afb
init RSpec
Apr 17, 2017
5f6a82e
created files for DOMReader class
Apr 18, 2017
a69e730
added Guardfile
Apr 18, 2017
e225dcb
moved file
Apr 18, 2017
55e8911
defined initial specs
Apr 18, 2017
c4e7fb0
added specs/behavior for #initialize
Apr 18, 2017
bb24608
added hook, modified spec for #initialize
Apr 18, 2017
1026044
added specs/behavior for #read_file
Apr 18, 2017
45f9bc3
define specs to check for various types of text in our file
Apr 18, 2017
b374d1f
implemented specs/behavior for #is_open_tag? method
Apr 18, 2017
fd4053e
implemented specs/behavior for #is_close_tag? method
Apr 18, 2017
bf07b5a
implemented specs/behavior for #is_text? method
Apr 18, 2017
a345197
implemented specs/behavior for #is_mixed_content? method
Apr 19, 2017
4c259ac
implemented specs/behavior for #tokenize method
Apr 19, 2017
b97eb4f
implemented specs/behavior for #split_tag_and_text method
Apr 19, 2017
81eb772
added specs for tokenizing mixed content
Apr 20, 2017
49d8fe7
first attempt at tokenizing mixed content
Apr 20, 2017
b38bfcb
modified specs/behavior for #is_open_tag? to check for tag attributes
Apr 20, 2017
acddb2f
modified specs/behavior for #is_mixed_content? to check for tag attri…
Apr 20, 2017
e39ba23
additional spec for #is_text? method
Apr 20, 2017
22c64cb
reworked specs / regex for #tokenize method
Apr 20, 2017
cf4380d
removed unnecessary specs/methods
Apr 20, 2017
f7db9b3
renamed spec
Apr 20, 2017
93e35a4
added specs/behavior for #remove_doctype method
Apr 20, 2017
8c123b7
refactored #read_file method
Apr 20, 2017
73553d2
defined initial specs for #token_type method
Apr 20, 2017
b6d8db3
added example.rb to run test code
Apr 20, 2017
8524d2a
added specs/behavior for #token_type
Apr 20, 2017
8818ca0
updated test code
Apr 20, 2017
2164b6d
initial specs for DOMTree class
Apr 20, 2017
5016878
initial DOMTree class definition
Apr 20, 2017
eb5816b
added basic spec for #initialize
Apr 20, 2017
df18865
added specs/behavior for instance variables
Apr 20, 2017
7d4cf97
added specs/behavior for Node struct
Apr 20, 2017
9f9bd14
changed #remove_doctype_tag method to #is_doctype_tag? method
Apr 20, 2017
94f3b0a
changed attr_reader to attr_accessor
Apr 20, 2017
c8e2d6d
added initial specs for #build_tree
Apr 20, 2017
137404c
added initial implementation for #build_tree
Apr 20, 2017
aabc442
implemented #build_tree method
Apr 21, 2017
88e8205
modified specs for #read_file to use File stub
Apr 22, 2017
9f01c27
modified spec for #build_tree to use File stub
Apr 22, 2017
fdaeb89
modified spec for #build_tree
Apr 22, 2017
dc3d9f0
fixed bug in #build_tree with traversing up to parent node
Apr 22, 2017
3faac92
reordered methods
Apr 22, 2017
e567501
added specs/implementation for #tokenize_file method
Apr 22, 2017
c64ba8e
refactored #build_tree to use #tokenize_file
Apr 22, 2017
436ce9f
change specs order
Apr 22, 2017
0bf9b88
added specs/behavior for #remove_doctype method
Apr 22, 2017
18c29dc
refactored #build_tree method
Apr 22, 2017
617ece3
refactored spec for #build_tree
Apr 22, 2017
4845f69
added example Ruby file to run program
Apr 22, 2017
f788f04
initial specs definition for NodeRenderer class
Apr 22, 2017
a1357ef
initial NodeRenderer class definition
Apr 22, 2017
d026162
added specs/behavior for #initialize and instance variable
Apr 22, 2017
51fc7f4
added #render and #display_data_attributes methods
Apr 22, 2017
63c805f
modified spec for #render
Apr 22, 2017
969fa4e
modified test code
Apr 22, 2017
ab6e0b6
refactored #build_tree method
Apr 22, 2017
3f62023
added specs/behavior for #num_nodes_below method
Apr 23, 2017
94ee191
modified specs for #num_nodes_below method
Apr 23, 2017
15c360c
implemented specs/behaviors for #node_types_below below
Apr 23, 2017
aeb7477
removed superfluous comment
Apr 23, 2017
f9aca6c
renamed context
Apr 23, 2017
61d8ee2
modified spec for #render
Apr 23, 2017
06e615d
added #display_num_nodes_below method
Apr 23, 2017
8b44ded
added #display_node_types_below method
Apr 23, 2017
e71af9d
modified output of display methods
Apr 23, 2017
8ec000e
reworked specs for #render
Apr 23, 2017
a6d839c
initial specs/class definitions for TreeSearcher class
Apr 23, 2017
c6ab3a3
defined more specs for TreeSearcher class
Apr 23, 2017
17ec369
added spec for #search_by
Apr 23, 2017
7f1cd95
added initial implementation for #search_by with private helper method
Apr 23, 2017
f01054a
modified specs for #search_by
Apr 23, 2017
3c11d7f
fixed bugs for #search_by, for searching by tags
Apr 23, 2017
75c87f2
removed unnecessary code
Apr 23, 2017
eb9c4d8
added specs for searching nodes by text
Apr 23, 2017
99410d3
modified #search_by to add searching by text
Apr 23, 2017
992ce76
added specs/behavior for searching nodes by id
Apr 23, 2017
4b5ac57
made regex case-insensitive
Apr 23, 2017
20c0042
added specs/behavior for searching nodes by class
Apr 24, 2017
6ce71a3
removed spec
Apr 24, 2017
efe93da
added new context to specs
Apr 24, 2017
3909a43
added specs/behavior for #search_descendents method
Apr 24, 2017
9deed65
edited comment
Apr 24, 2017
d28f780
edited comment
Apr 24, 2017
61ed773
added specs/behavior for #search_ancestors method
Apr 24, 2017
ca29bd3
added DOMRebuilder class
Apr 24, 2017
2238d65
modified test code
Apr 24, 2017
f80a4c7
modified README
Apr 24, 2017
3 changes: 3 additions & 0 deletions .rspec
@@ -0,0 +1,3 @@
--color
--require spec_helper
--format doc
6 changes: 6 additions & 0 deletions Gemfile
@@ -0,0 +1,6 @@
# Gemfile

source 'https://rubygems.org'

gem 'rspec', '~> 3.5.0'
gem 'guard-rspec', '~> 4.7.3', require: false
65 changes: 65 additions & 0 deletions Gemfile.lock
@@ -0,0 +1,65 @@
GEM
  remote: https://rubygems.org/
  specs:
    coderay (1.1.1)
    diff-lcs (1.3)
    ffi (1.9.18)
    formatador (0.2.5)
    guard (2.14.1)
      formatador (>= 0.2.4)
      listen (>= 2.7, < 4.0)
      lumberjack (~> 1.0)
      nenv (~> 0.1)
      notiffany (~> 0.0)
      pry (>= 0.9.12)
      shellany (~> 0.0)
      thor (>= 0.18.1)
    guard-compat (1.2.1)
    guard-rspec (4.7.3)
      guard (~> 2.1)
      guard-compat (~> 1.1)
      rspec (>= 2.99.0, < 4.0)
    listen (3.1.5)
      rb-fsevent (~> 0.9, >= 0.9.4)
      rb-inotify (~> 0.9, >= 0.9.7)
      ruby_dep (~> 1.2)
    lumberjack (1.0.11)
    method_source (0.8.2)
    nenv (0.3.0)
    notiffany (0.1.1)
      nenv (~> 0.1)
      shellany (~> 0.0)
    pry (0.10.4)
      coderay (~> 1.1.0)
      method_source (~> 0.8.1)
      slop (~> 3.4)
    rb-fsevent (0.9.8)
    rb-inotify (0.9.8)
      ffi (>= 0.5.0)
    rspec (3.5.0)
      rspec-core (~> 3.5.0)
      rspec-expectations (~> 3.5.0)
      rspec-mocks (~> 3.5.0)
    rspec-core (3.5.4)
      rspec-support (~> 3.5.0)
    rspec-expectations (3.5.0)
      diff-lcs (>= 1.2.0, < 2.0)
      rspec-support (~> 3.5.0)
    rspec-mocks (3.5.0)
      diff-lcs (>= 1.2.0, < 2.0)
      rspec-support (~> 3.5.0)
    rspec-support (3.5.0)
    ruby_dep (1.5.0)
    shellany (0.0.1)
    slop (3.6.0)
    thor (0.19.4)

PLATFORMS
  ruby

DEPENDENCIES
  guard-rspec (~> 4.7.3)
  rspec (~> 3.5.0)

BUNDLED WITH
   1.14.4
70 changes: 70 additions & 0 deletions Guardfile
@@ -0,0 +1,70 @@
# A sample Guardfile
# More info at https://github.com/guard/guard#readme

## Uncomment and set this to only include directories you want to watch
# directories %w(app lib config test spec features) \
# .select{|d| Dir.exist?(d) ? d : UI.warning("Directory #{d} does not exist")}

## Note: if you are using the `directories` clause above and you are not
## watching the project directory ('.'), then you will want to move
## the Guardfile to a watched dir and symlink it back, e.g.
#
# $ mkdir config
# $ mv Guardfile config/
# $ ln -s config/Guardfile .
#
# and, you'll have to watch "config/Guardfile" instead of "Guardfile"

# Note: The cmd option is now required due to the increasing number of ways
# rspec may be run, below are examples of the most common uses.
# * bundler: 'bundle exec rspec'
# * bundler binstubs: 'bin/rspec'
# * spring: 'bin/rspec' (This will use spring if running and you have
# installed the spring binstubs per the docs)
# * zeus: 'zeus rspec' (requires the server to be started separately)
# * 'just' rspec: 'rspec'

guard :rspec, cmd: "bundle exec rspec" do
  require "guard/rspec/dsl"
  dsl = Guard::RSpec::Dsl.new(self)

  # Feel free to open issues for suggestions and improvements

  # RSpec files
  rspec = dsl.rspec
  watch(rspec.spec_helper) { rspec.spec_dir }
  watch(rspec.spec_support) { rspec.spec_dir }
  watch(rspec.spec_files)

  # Ruby files
  ruby = dsl.ruby
  dsl.watch_spec_files_for(ruby.lib_files)

  # Rails files
  rails = dsl.rails(view_extensions: %w(erb haml slim))
  dsl.watch_spec_files_for(rails.app_files)
  dsl.watch_spec_files_for(rails.views)

  watch(rails.controllers) do |m|
    [
      rspec.spec.call("routing/#{m[1]}_routing"),
      rspec.spec.call("controllers/#{m[1]}_controller"),
      rspec.spec.call("acceptance/#{m[1]}")
    ]
  end

  # Rails config changes
  watch(rails.spec_helper) { rspec.spec_dir }
  watch(rails.routes) { "#{rspec.spec_dir}/routing" }
  watch(rails.app_controller) { "#{rspec.spec_dir}/controllers" }

  # Capybara features specs
  watch(rails.view_dirs) { |m| rspec.spec.call("features/#{m[1]}") }
  watch(rails.layouts) { |m| rspec.spec.call("features/#{m[1]}") }

  # Turnip features and steps
  watch(%r{^spec/acceptance/(.+)\.feature$})
  watch(%r{^spec/acceptance/steps/(.+)_steps\.rb$}) do |m|
    Dir[File.join("**/#{m[1]}.feature")][0] || "spec/acceptance"
  end
end
18 changes: 18 additions & 0 deletions README.md
@@ -2,3 +2,21 @@
Like leaves on the wind

[A data structures, algorithms, file I/O, Ruby and regular expression (regex) project from the Viking Code School](http://www.vikingcodeschool.com)

Worked on by [Roy Chen](https://github.com/roychen25)

## Getting Started

To run this program, fork and clone this repository.

In the cloned directory, run this command:

```
ruby example.rb
```

The output includes:

1. Information about the root node of the DOM tree built from `test.html`

2. The DOM tree rebuilt into its original format
15 changes: 15 additions & 0 deletions example.rb
@@ -0,0 +1,15 @@
if $PROGRAM_NAME == __FILE__
  require_relative './lib/dom_reader'
  require_relative './lib/node_renderer'
  require_relative './lib/dom_rebuilder'

  dom_reader = DOMReader.new

  tree = dom_reader.build_tree('./test.html')

  node_renderer = NodeRenderer.new(tree)
  node_renderer.render

  dom_rebuilder = DOMRebuilder.new(tree)
  dom_rebuilder.print_tree
end
113 changes: 113 additions & 0 deletions lib/dom_reader.rb
@@ -0,0 +1,113 @@
require_relative './dom_tree'

class DOMReader
  def initialize; end

  def build_tree(filename)
    # break file into tokens
    tokens = tokenize_file(filename)

    # exit if we have no tokens
    return nil if tokens.empty?

    # remove doctype declaration, if any
    tokens = remove_doctype(tokens)

    # initialize the tree
    tree = DOMTree.new
    current_node = tree.document

    # set current depth of the tree
    current_depth = 0

    # create tree's root node if it does not exist yet
    if tree.document.nil?
      token = tokens.shift
      tree.document = Node.new(token_type(token), token, current_depth, nil, [])
      current_node = tree.document
    end

    # process remaining tokens
    until tokens.empty?
      token = tokens.shift
      type = token_type(token)

      node = Node.new(type, token, current_depth, current_node, [])
      current_node.children << node

      case type
      when :open_tag
        current_node = node
        current_depth += 1
      when :text
        next
      when :close_tag
        # identity check: Struct#== would compare every member, links included
        current_node = current_node.parent unless current_node.equal?(tree.document)
        current_depth -= 1 unless current_depth.zero?
      when :unknown
        next # ignore unknown tags for now
      end
    end

    tree
  end

  def read_file(filename)
    raise "The file to be read does not exist." unless File.exist?(filename)

    File.readlines(filename)
  end

  def tokenize_file(filename)
    lines = read_file(filename)
    # flat_map always returns an Array, whereas flatten! returns nil
    # when nothing was flattened (e.g. for an empty file)
    lines.flat_map { |line| tokenize(line) }
  end

  def tokenize(text)
    # a tag (open, close or doctype) or a run of text between tags
    regex = /<.+?>|[^<>]+/

    text.scan(regex)
  end

  def remove_doctype(tokens)
    tokens.delete_at(0) if is_doctype_tag?(tokens[0])
    tokens
  end

  def token_type(token)
    return :unknown unless token.is_a?(String)

    return :open_tag if is_open_tag?(token)

    return :close_tag if is_close_tag?(token)

    return :text if is_text?(token)

    :unknown
  end

  def is_doctype_tag?(text)
    regex = /<!doctype html>/i
    !text.match(regex).nil?
  end

  def is_open_tag?(text)
    # tag name, optionally followed by attributes
    regex = /\A<(\w+)\s*(.*)>\z/
    !text.match(regex).nil?
  end

  def is_close_tag?(text)
    regex = /\A<(\/\w+)>\z/
    !text.match(regex).nil?
  end

  def is_text?(text)
    regex = /\A[^<>\/]+\z/
    !text.match(regex).nil?
  end
end
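To make the tokenizing step easier to review, here is a minimal standalone sketch of how a tag-or-text regex splits one line of HTML and how each token could then be classified. `TOKEN_REGEX` and `classify` are hypothetical names used only for illustration; they are not part of `DOMReader`, and the classification rules are a simplified assumption modelled on the helper methods above.

```ruby
# Hypothetical sketch, not the project's code: split a line into tag and
# text tokens, then label each token.
TOKEN_REGEX = /<.+?>|[^<>]+/

def classify(token)
  case token
  when /\A<\/\w+>\z/    then :close_tag # e.g. </p>
  when /\A<\w+\s*.*>\z/ then :open_tag  # e.g. <p class="intro">
  when /\A[^<>\/]+\z/   then :text      # anything without <, > or /
  else :unknown
  end
end

html = '<p class="intro">Hello <em>world</em></p>'

# scan returns every non-overlapping match, left to right
tokens = html.scan(TOKEN_REGEX)
# => ["<p class=\"intro\">", "Hello ", "<em>", "world", "</em>", "</p>"]

pairs = tokens.map { |token| [classify(token), token] }
pairs.each { |type, token| puts "#{type.to_s.ljust(10)} #{token.inspect}" }
```

Note that whitespace between tags is kept as text tokens under this approach, which is what lets a rebuilder reproduce the original layout.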
30 changes: 30 additions & 0 deletions lib/dom_rebuilder.rb
@@ -0,0 +1,30 @@
class DOMRebuilder
  attr_reader :tree

  def initialize(tree = nil)
    @tree = tree
  end

  # print out the DOM tree using an iterative depth-first traversal
  def print_tree(start_node = nil)
    output = ""

    start_node ||= tree && tree.document
    return if start_node.nil? # nothing to print

    stack = [start_node]

    until stack.empty?
      current_node = stack.pop

      output << current_node.content.to_s

      # push the children in reverse order so that the first child
      # ends up on top of the stack and is printed first, which
      # preserves the original document order
      current_node.children.reverse_each { |child| stack.push(child) }
    end

    puts output
  end
end
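The reason for reversing the children before pushing them can be seen in a small standalone sketch. `RenderSketchNode` and `preorder` are hypothetical names introduced here for illustration, and the tree shape assumes, as in `DOMReader#build_tree`, that a closing tag is stored as the last child of its opening tag.

```ruby
# Hypothetical sketch of a stack-based pre-order traversal.
RenderSketchNode = Struct.new(:content, :children)

def preorder(root)
  order = []
  stack = [root]

  until stack.empty?
    node = stack.pop
    order << node.content
    # A stack is last-in, first-out, so push the children right to left
    # to have the leftmost child popped (and visited) first.
    node.children.reverse_each { |child| stack.push(child) }
  end

  order
end

leaf = ->(text) { RenderSketchNode.new(text, []) }

root = RenderSketchNode.new('<ul>', [
  RenderSketchNode.new('<li>', [leaf.call('one'), leaf.call('</li>')]),
  RenderSketchNode.new('<li>', [leaf.call('two'), leaf.call('</li>')]),
  leaf.call('</ul>')
])

rebuilt = preorder(root).join
puts rebuilt # prints <ul><li>one</li><li>two</li></ul>
```

Without the reversal, the last child would be popped first and the closing `</ul>` would be emitted before either list item.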
50 changes: 50 additions & 0 deletions lib/dom_tree.rb
@@ -0,0 +1,50 @@
Node = Struct.new(:type, :content, :depth, :parent, :children)

class DOMTree
  attr_accessor :document

  def initialize(document = nil)
    # root node of tree
    @document = document
  end

  # breadth-first helpers for inspecting the nodes below a given node

  def num_nodes_below(node = nil)
    return 0 if node.nil? || node.children.empty?

    count = 0
    # seed the queue with the direct children, skipping any nil entries
    queue = node.children.compact

    until queue.empty?
      current_node = queue.shift
      count += 1
      queue.concat(current_node.children.compact)
    end

    count
  end

  def node_types_below(node = nil)
    return {} if node.nil? || node.children.empty?

    node_types = {}

    # seed the queue with the direct children, skipping any nil entries
    queue = node.children.compact

    until queue.empty?
      current_node = queue.shift
      queue.concat(current_node.children.compact)

      if node_types.key?(current_node.type)
        node_types[current_node.type] += 1
      else
        node_types[current_node.type] = 1
      end
    end

    node_types
  end
end
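As a possible follow-up, the two traversal methods above share the same breadth-first loop, and the per-type tally can be shortened with a default-valued hash. The sketch below is only an illustration under that assumption; `TypedSketchNode` and `count_types_below` are hypothetical names, not part of `DOMTree`.

```ruby
# Hypothetical sketch: breadth-first tally of node types below a node.
TypedSketchNode = Struct.new(:type, :children)

def count_types_below(node)
  counts = Hash.new(0) # missing keys read as 0, so += 1 needs no branch
  return counts if node.nil?

  queue = node.children.compact

  until queue.empty?
    current = queue.shift
    counts[current.type] += 1
    queue.concat(current.children.compact)
  end

  counts
end

inner = TypedSketchNode.new(:open_tag, [
  TypedSketchNode.new(:text, []),
  TypedSketchNode.new(:close_tag, [])
])

root = TypedSketchNode.new(:open_tag, [
  TypedSketchNode.new(:text, []),
  inner,
  TypedSketchNode.new(:close_tag, [])
])

counts = count_types_below(root)
total = counts.values.sum

puts counts.inspect
puts "#{total} nodes below the root"
```

The total number of nodes below a node is then just the sum of the tallies, so a single traversal can serve both questions.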