Sentiment Classification on Large Movie Reviews with BigDL

Sentiment Classification on Large Movie Reviews

Sentiment Analysis is understood as a classic natural language processing problem. In this example, a large moview review dataset was chosen from IMDB to do a sentiment classification task with some deep learning approaches. The labeled data set consists of 50,000 IMDB movie reviews (good or bad), in which 25000 highly polar movie reviews for training, and 25,000 for testing. The dataset is originally collected by Stanford researchers and was used in a 2011 paper, and the highest accuray of 88.33% was achieved without using the unbalanced data. This example illustrates some deep learning approaches to do the sentiment classification with BigDL python API.

Load the IMDB Dataset

The IMDB dataset need to be loaded into BigDL, note that the dataset has been pre-processed, and each review was encoded as a sequence of integers. Each integer represents the index of the overall frequency of dataset, for instance, ‘5’ means the 5-th most frequent words occured in the data. It is very convinient to filter the words by some conditions, for example, to filter only the top 5,000 most common word and/or eliminate the top 30 most common words. Let’s define functions to load the pre-processed data.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
from dataset import base
import numpy as np

def download_imdb(dest_dir):
"""Download pre-processed IMDB movie review data

:argument
dest_dir: destination directory to store the data

:return
The absolute path of the stored data
"""

file_name = "imdb.npz"
file_abs_path = base.maybe_download(file_name,
dest_dir,
'https://s3.amazonaws.com/text-datasets/imdb.npz')
return file_abs_path

def load_imdb(dest_dir='/tmp/.bigdl/dataset'):
"""Load IMDB dataset.

:argument
dest_dir: where to cache the data (relative to `~/.bigdl/dataset`).

:return
the train, test separated IMDB dataset.
"""

path = download_imdb(dest_dir)
f = np.load(path)
x_train = f['x_train']
y_train = f['y_train']
x_test = f['x_test']
y_test = f['y_test']
f.close()

return (x_train, y_train), (x_test, y_test)

print('Processing text dataset')
(x_train, y_train), (x_test, y_test) = load_imdb()
print('finished processing text')
Processing text dataset
finished processing text

In order to set a proper max sequence length, we need to go througth the property of the data and see the length distribution of each sentence in the dataset. A box and whisker plot is shown below for reviewing the length distribution in words.

1
2
3
4
5
6
7
8
9
10
11
12
# Summarize review length
from matplotlib import pyplot

print("Review length: ")
X = np.concatenate((x_train, x_test), axis=0)
result = [len(x) for x in X]
print("Mean %.2f words (%f)" % (np.mean(result), np.std(result)))
# plot review length
# Create a figure instance
fig = pyplot.figure(1, figsize=(6, 6))
pyplot.boxplot(result)
pyplot.show()
Review length: 
Mean 233.76 words (172.911495)


Looking the box and whisker plot, the max length of a sample in words is 500, and the mean and median are below 250. According to the plot, we can probably cover the mass of the distribution with a clipped length of 400 to 500. Here we set the max sequence length of each sample as 500.

The corresponding vocabulary sorted by frequency is also required, for further embedding the words with pre-trained vectors. The downloaded vocabulary is in {word: index}, where each word as a key and the index as a value. It needs to be transformed into {index: word} format.

Let’s define a function to obtain the vocabulary.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
import json

def get_word_index(dest_dir='/tmp/.bigdl/dataset', ):
"""Retrieves the dictionary mapping word indices back to words.

:argument
path: where to cache the data (relative to `~/.bigdl/dataset`).

:return
The word index dictionary.
"""

file_name = "imdb_word_index.json"
path = base.maybe_download(file_name,
dest_dir,
source_url='https://s3.amazonaws.com/text-datasets/imdb_word_index.json')
f = open(path)
data = json.load(f)
f.close()
return data

print('Processing vocabulary')
word_idx = get_word_index()
idx_word = {v:k for k,v in word_idx.items()}
print('finished processing vocabulary')
Processing vocabulary
finished processing vocabulary

Text pre-processing

Before we train the network, some pre-processing steps need to be applied to the dataset.

Next let’s go through the mechanisms that used to be applied to the data.

  • We insert a start_char at the beginning of each sentence to mark the start point. We set it as 2 here, and each other word index will plus a constant index_from to differentiate some ‘helper index’ (eg. start_char, oov_char, etc.).

  • A max_words variable is defined as the maximum index number (the least frequent word) included in the sequence. If the word index number is larger than max_words, it will be replaced by a out-of-vocabulary number oov_char, which is 3 here.

  • Each word index sequence is restricted to the same length. We used left-padding here, which means the right (end) of the sequence will be keep as many as possible and drop the left (head) of the sequence if its length is more than pre-defined sequence_len, or padding the left (head) of the sequence with padding_value.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
def replace_oov(x, oov_char, max_words):
"""
Replace the words out of vocabulary with `oov_char`
:param x: a sequence
:param max_words: the max number of words to include
:param oov_char: words out of vocabulary because of exceeding the `max_words`
limit will be replaced by this character

:return: The replaced sequence
"""

return [oov_char if w >= max_words else w for w in x]

def pad_sequence(x, fill_value, length):
"""
Pads each sequence to the same length
:param x: a sequence
:param fill_value: pad the sequence with this value
:param length: pad sequence to the length

:return: the padded sequence
"""

if len(x) >= length:
return x[(len(x) - length):]
else:
return [fill_value] * (length - len(x)) + x

def to_sample(features, label):
"""
Wrap the `features` and `label` to a training sample object
:param features: features of a sample
:param label: label of a sample

:return: a sample object including features and label
"""

return Sample.from_ndarray(np.array(features, dtype='float'), np.array(label))

padding_value = 1
start_char = 2
oov_char = 3
index_from = 3
max_words = 5000
sequence_len = 500

print('start transformation')

train_rdd = sc.parallelize(zip(x_train, y_train), 2) \
.map(lambda (x, y): ([start_char] + [w + index_from for w in x] , y))\
.map(lambda (x, y): (replace_oov(x, oov_char, max_words), y))\
.map(lambda (x, y): (pad_sequence(x, padding_value, sequence_len), y))\
.map(lambda (x, y): to_sample(x, y))
test_rdd = sc.parallelize(zip(x_test, y_test), 2) \
.map(lambda (x, y): ([start_char] + [w + index_from for w in x], y))\
.map(lambda (x, y): (replace_oov(x, oov_char, max_words), y))\
.map(lambda (x, y): (pad_sequence(x, padding_value, sequence_len), y))\
.map(lambda (x, y): to_sample(x, y))

print('finish transformation')
start transformation
finish transformation

Word Embedding

Word embedding is a recent breakthrough in natural language field. The key idea is to encode words and phrases into distributed representations in the format of word vectors, which means each word is represented as a vector. There are two widely used word vector training alogirhms, one is published by Google called word to vector, the other is published by Standford called Glove. In this example, pre-trained glove is loaded into a lookup table and will be fine-tuned during the training process. BigDL provides a method to download and load glove in news20 package.

1
2
3
4
5
6
7
8
from dataset import news20
import itertools

embedding_dim = 100

print('loading glove')
glove = news20.get_glove_w2v(source_dir='/tmp/.bigdl/dataset', dim=embedding_dim)
print('finish loading glove')
loading glove
finish loading glove

For each word whose index less than the max_word should try to match its embedding and store in an array.

With regard to those words which can not be found in glove, we randomly sample it from a [-0.05, 0.05] uniform distribution.

BigDL usually use a LookupTable layer to do word embedding, so the matrix will be loaded to the LookupTable by seting the weight.

1
2
3
4
5
6
print('processing glove')
w2v = [glove.get(idx_word.get(i - index_from), np.random.uniform(-0.05, 0.05, embedding_dim))
for i in xrange(1, max_words + 1)]
w2v = np.array(list(itertools.chain(*np.array(w2v, dtype='float'))), dtype='float') \
.reshape([max_words, embedding_dim])
print('finish processing glove')
processing glove
finish processing glove

Build models

Next, let’s build some deep learning models for the sentiment classification.

As an example, several deep learning models are illustrated for tutorial, comparison and demonstration.

LSTM, GRU, Bi-LSTM, CNN and CNN + LSTM models are implemented as options. To decide which model to use, just assign model_type the corresponding string.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
from nn.layer import *
from util.common import *

p = 0.2

def build_model(w2v):
model = Sequential()

embedding = LookupTable(max_words, embedding_dim)
embedding.set_weights([w2v])
model.add(embedding)
if model_type.lower() == "gru":
model.add(Recurrent()
.add(GRU(embedding_dim, 128, p))) \
.add(Select(2, -1))
elif model_type.lower() == "lstm":
model.add(Recurrent()
.add(LSTM(embedding_dim, 128, p)))\
.add(Select(2, -1))
elif model_type.lower() == "bi_lstm":
model.add(BiRecurrent(CAddTable())
.add(LSTM(embedding_dim, 128, p)))\
.add(Select(2, -1))
elif model_type.lower() == "cnn":
model.add(Transpose([(2, 3)]))\
.add(Dropout(p))\
.add(Reshape([embedding_dim, 1, sequence_len]))\
.add(SpatialConvolution(embedding_dim, 128, 5, 1))\
.add(ReLU())\
.add(SpatialMaxPooling(sequence_len - 5 + 1, 1, 1, 1))\
.add(Reshape([128]))
elif model_type.lower() == "cnn_lstm":
model.add(Transpose([(2, 3)]))\
.add(Dropout(p))\
.add(Reshape([embedding_dim, 1, sequence_len])) \
.add(SpatialConvolution(embedding_dim, 64, 5, 1)) \
.add(ReLU()) \
.add(SpatialMaxPooling(4, 1, 1, 1)) \
.add(Squeeze(3)) \
.add(Transpose([(2, 3)])) \
.add(Recurrent()
.add(LSTM(64, 128, p))) \
.add(Select(2, -1))

model.add(Linear(128, 100))\
.add(Dropout(0.2))\
.add(ReLU())\
.add(Linear(100, 1))\
.add(Sigmoid())

return model

Optimization

Optimizer need to be created to optimise the model.

Here we use the CNN model.

More details about optimizer in BigDL, please refer to Programming Guide.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
from optim.optimizer import *
from nn.criterion import *

max_epoch = 4
batch_size = 64
model_type = 'gru'

init_engine()

optimizer = Optimizer(
model=build_model(w2v),
training_rdd=train_rdd,
criterion=BCECriterion(),
end_trigger=MaxEpoch(max_epoch),
batch_size=batch_size,
optim_method=Adam())

optimizer.set_validation(
batch_size=batch_size,
val_rdd=test_rdd,
trigger=EveryEpoch(),
val_method=["Top1Accuracy"])
creating: createSequential
creating: createLookupTable
creating: createRecurrent
creating: createGRU
creating: createSelect
creating: createLinear
creating: createDropout
creating: createReLU
creating: createLinear
creating: createSigmoid
creating: createBCECriterion
creating: createMaxEpoch
creating: createAdam
creating: createOptimizer
creating: createEveryEpoch

To make the training process be visualized by TensorBoard, training summaries should be saved as a format of logs.

With regard to the usage of TensorBoard in BigDL, please refer to BigDL Wiki.

1
2
3
4
5
6
7
8
9
10
import datetime as dt

logdir = '/tmp/.bigdl/'
app_name = 'adam-' + dt.datetime.now().strftime("%Y%m%d-%H%M%S")

train_summary = TrainSummary(log_dir=logdir, app_name=app_name)
train_summary.set_summary_trigger("Parameters", SeveralIteration(50))
val_summary = ValidationSummary(log_dir=logdir, app_name=app_name)
optimizer.set_train_summary(train_summary)
optimizer.set_val_summary(val_summary)
creating: createTrainSummary
creating: createSeveralIteration
creating: createValidationSummary





<optim.optimizer.Optimizer at 0x7fa651314ad0>

Now, let’s start training!

1
2
3
%%time
train_model = optimizer.optimize()
print "Optimization Done."
Optimization Done.
CPU times: user 196 ms, sys: 24 ms, total: 220 ms
Wall time: 51min 52s

Test

Validation accuracy is shown in the training log, here let’s get the accuracy on validation set by hand.

Predict the test_rdd (validation set data), and obtain the predicted label and ground truth label in the list.

1
2
3
4
5
6
7
8
9
10
11
12
13
predictions = train_model.predict(test_rdd)

def map_predict_label(l):
if l > 0.5:
return 1
else:
return 0
def map_groundtruth_label(l):
return l[0]

y_pred = np.array([ map_predict_label(s) for s in predictions.collect()])

y_true = np.array([map_groundtruth_label(s.label) for s in test_rdd.collect()])

Then let’s see the prediction accuracy on validation set.

1
2
3
4
5
6
7
correct = 0
for i in xrange(0, y_pred.size):
if (y_pred[i] == y_true[i]):
correct += 1

accuracy = float(correct) / y_pred.size
print 'Prediction accuracy on validation set is: ', accuracy
Prediction accuracy on validation set is:  0.89312

Show the confusion matrix

1
2
3
4
5
6
7
8
9
10
11
12
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sn
import pandas as pd
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred)
cm.shape

df_cm = pd.DataFrame(cm)
plt.figure(figsize = (5,4))
sn.heatmap(df_cm, annot=True,fmt='d')
<matplotlib.axes._subplots.AxesSubplot at 0x7fa643fe0c50>


Because of the limitation of ariticle length, not all the results of optional models can be shown respectively. Please try other provided optional models to see the results. If you are interested in optimizing the results, try different training parameters which may make inpacts on the result, such as the max sequence length, batch size, training epochs, preprocessing schemes, optimization methods and so on. Among the models, CNN training would be much quicker. Note that the LSTM and it variants (eg. GRU) are difficult to train, even a unsuitable batch size may cause the model not converge. In addition it is prone to overfitting, please try different dropout threshold and/or add regularizers (abouth how to add regularizers in BigDL please see BigDL Wiki).

Summary

In this example, you learned how to use BigDL to develop deep learning models for sentiment analysis including:

  • How to load and review the IMDB dataset
  • How to do word embedding with Glove
  • How to build a CNN model for NLP with BigDL
  • How to build a LSTM model for NLP with BigDL
  • How to build a GRU model for NLP with BigDL
  • How to build a Bi-LSTM model for NLP with BigDL
  • How to build a CNN-LSTM model for NLP with BigDL
  • How to train deep learning models with BigDL

Thanks for your reading, please enjoy the trip on using BigDL to build deep learning models.

Fractals

A Small Program generating Fractals with Haskell

This blog is about how to use the functional programming language Haskell to write a small program to generate Fractals pictures.
If you have never seen a fractal before, there are two famous examples shown in the following pictures (Mandelbrot and Julia):






You can think of the fractals in this exercise as the graph of a complicated mathematical function. You may be familiar with graphs of functions such as $f(x) = x^2$ or $g(x) = a * x + b$. A fractal is a graph of a more complicated function over more dimensions. One interesting property of fractals is that they are self-similar if you zoom close enough, you will see copies of the fractal again. This pattern arises in nature as well: every cauliflower floret has the same shape as an entire cauliflower.
You can find a lot more information about fractals on Wikipedia: http://en.wikipedia.org/wiki/Fractal

The code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
------------------------------
-- Author: Yao Zhang
------------------------------
module Fractals where

import Data.Char
import System.IO
import Data.List (intersperse, transpose)
import Data.Maybe


validColour colour = colour >= 0 && colour <=255

validRGB (RGB a b c) = all validColour [a, b, c]

ppmHeader (a, b) = "P6 " ++ show a ++ " " ++ show b ++ " 255\n"

validBitmap [] = Just(0, 0)
validBitmap a@(xs:xss) | all (all validRGB) a && all (== length xs) (map length a) = Just(length xs, length a)
| otherwise = Nothing

encodeRGB (RGB a b c) = chr a : chr b : chr c :[]

encodeBitmap = concatMap (concatMap encodeRGB)

writePPM f x | validBitmap x == Nothing = error "The bitmap is not valid."
| otherwise = writeBinaryFile f ( (ppmHeader . fromJust . validBitmap) x ++ encodeBitmap x )

mandelbrot p = iterate (nextPoint p) (0,0)

julia c = iterate (nextPoint c)

sample pss i = map (map i) pss

fairlyClose (a, b) = a * a + b * b < 100

fairlyCloseTill n f p = length (takeWhile fairlyClose (take n (f p)))

fracImage f p = palette !! (fairlyCloseTill (length palette - 1) f p)

draw pss f = sample pss (fracImage f)


type Colour = Int
data RGB = RGB Colour Colour Colour
type Bitmap = [[RGB]]

-------------------------------------------------------------------------------
-- PART ONE: BITMAPS --
-------------------------------------------------------------------------------

validColour :: Colour -> Bool


validRGB :: RGB -> Bool


ppmHeader :: (Int,Int) -> String


validBitmap :: Bitmap -> Maybe (Int,Int)


encodeRGB :: RGB -> String


encodeBitmap :: Bitmap -> String


writeBinaryFile :: FilePath -> String -> IO ()
writeBinaryFile f x = do
h <- openBinaryFile f WriteMode
hPutStr h x
hClose h

writePPM :: FilePath -> Bitmap -> IO ()


-- Here are a few example bitmaps. You can use them to test the
-- "writePPM" function.
chessboard :: Bitmap
chessboard = concat $ alternate 8 evenRow oddRow
where
blackSquare = replicate 10 (replicate 10 black)
whiteSquare = replicate 10 (replicate 10 white)
evenRow = transpose $ concat $ alternate 8 whiteSquare blackSquare
oddRow = transpose $ concat $ alternate 8 blackSquare whiteSquare
alternate n x y
| n == 0 = []
| otherwise = x : alternate (n - 1) y x

gradient :: Bitmap
gradient = map (map distance) [[(x,y) | x <- [0..size-1]] | y <- [0..size-1]]
where
size = 80
distance (x,y) =
let step = round (fromIntegral (x + y) * 255 / 158)
in RGB step step 255


-------------------------------------------------------------------------------
-- PART TWO: FRACTALS --
-------------------------------------------------------------------------------

type Point = (Float, Float)
type Fractal = Point -> [Point]

nextPoint :: Point -> Point -> Point
nextPoint (u,v) (x,y) = (x*x-y*y+u, 2*x*y+v)

mandelbrot :: Fractal


julia :: Point -> Fractal



-------------------------------------------------------------------------------
-- PART THREE: RENDERING FRACTALS --
-------------------------------------------------------------------------------

type Image = Point -> RGB

sample :: [[Point]] -> Image -> Bitmap


fairlyClose :: Point -> Bool


fairlyCloseTill :: Int -> Fractal -> Point -> Int


fracImage :: Fractal -> Image


draw :: [[Point]] -> Fractal -> Bitmap


-- The colour palette
type Palette = [RGB]

palette :: Palette
palette = take 15 (iterate f darkred) ++ replicate 5 green ++ [blue, black]
where
darkred = RGB 200 0 0
f (RGB r g b) = RGB (r + 2) (g + 10) (b)

black = RGB 0 0 0
cyan = RGB 0 255 255
blue = RGB 0 0 255
green = RGB 0 255 0
magenta = RGB 255 0 255
yellow = RGB 255 255 0
red = RGB 255 0 0
white = RGB 255 255 255

-- Useful functions for computing bitmaps from an Image
size :: Int
size = 400

for :: Int -> Float -> Float -> [Float]
for n min max = take n [min, min + delta_ ..]
where delta_ = (max-min) / fromIntegral (n-1)

grid :: Int -> Int -> Point -> Point -> [[Point]]
grid c r (xmin,ymin) (xmax,ymax) =
[[ (x,y) | x <- for c xmin xmax ] | y <- for r ymin ymax ]

-- A few examples
figure1 :: Bitmap
figure1 = draw points mandelbrot
where points = grid size size (-2.25, -1.5) (0.75, 1.5)

figure2 :: Bitmap
figure2 = draw points (julia (0.32,0.043))
where points = grid size size (-1.5, -1.5) (1.5, 1.5)

main = do
putStrLn "Writing mandelbrot bitmap..."
writePPM "mandelbrot.ppm" figure1
putStrLn "Writing julia bitmap..."
writePPM "julia.ppm" figure2
putStrLn "Done."

Deep Belief Network

One of the most popular unsupervised learning models is Restricted Boltzmann Machines (RBM), which has been used for many types of data including images [Hinton06], speech representation [Mohamed11] and moving ratings [Salakhutdinov07] . It was another success in building a generative model for bags of words that represents documents [31], which indicate its potential capacity to build a model to predict the stock with financial documents. The most widely usage of RBM is composed to build a Deep Belief Network (DBN). The learning process of RBM is usually contrastive divergence. Experiences are needed to decide the setting of numerical parameters, such the batch size, the learning rate, the momentum, the number of epochs to iterate, the layers of hidden units, and how many units in each hidden layer, the update methods being deterministically or stochastically and the initialization of weights and bias. It also needs to be considered the way to monitor the learning process and stop criteria. This section will introduce the key components and training method of our DBN model, based on the guide in [Hinton10].

Restricted Boltzmann Machines

Energy based model assigns an energy value to each possible configuration of units by an energy function. The function will be trained to make it has some properties such as making the energy of desirable configuration has energy as low as possible. Boltzmann Machines (BMs) are a type of energy model in a form of Markov Ran- dom Field whose energy function is linear for its parameters. In order to make it sufficiently powerful to express complex functions, some invisible variables as hid- den units are added. The Restricted Boltzmann Machines is a restricted version of BMs, which constrain the BMs without the connections between different vision units or connections between di↵erent hidden units. A example graph is shown in figure below.



Figure 1.1: A Restricted Boltzmann Machine


The energy function is shown below:
$$E(v,h) = - \sum_{j \in hidden}a_jhj - \sum{i \in visible} b_ivi - \sum{i,j}h_jviw{ij}$$

where $h_j$ $v_i$ represent the value of the hidden and visible units. $a_j$ is the offset of the visible layer and $bi$ is the offsets of the hidden layer. $w{ij}$ is the weights connecting the hidden and visible units. The energy model defines the probability distribution via the energy function:

$$p(\pmb{v,h}) = \frac{1}{Z}e^{-E(\pmb{v,h})}$$

where $Z$ called partition function is a sum of all possible combinations of visible and hidden vectors:

$$Z=\sum_{\pmb{v,h}} e^{-E(\pmb{v,h})}$$

The probability of a given vector can be calculated by summing all hidden units:

$$p(\pmb{v}) = \frac{1}{Z}\sum_{\pmb{h}}e^{-E(\pmb{v,h})}$$

According to the free energy function of Energy-Based Models $G(\pmb{v}) = -\log\sum_{\pmb{h}}e^{-E(\pmb{v,h})}$, we can get the free energy of RBM model:

$$G(v) = - bv - \sum{i} \log\sum{h_i} e^{h_i(a_i + W_i v)}$$

The probability assigned to a training input can be increased via adjusting $W$, $\pmb{a}$ and $\pmb{b}$ to make the energy of the training examples lower and other examples in the input space higher. Both of the visible units $\pmb{v}$ and hidden units are conditionally independent of one anther, by reason of its specific structure. It can be mathematically expressed as:

$$P(h_i = 1|\pmb{v}) = sigmoid(W_i\pmb{v} + a_i)$$
$$P(v_j = 1|\pmb{h}) = sigmoid(W_j\pmb{h} + b_j)$$

Then the free energy of a RBM can be simplified to the following equation:
$$G(\pmb{v}) = -\pmb{bv} -\sum_{i}\log{1 + e^{a_i + W_i\pmb{v}}}$$



Figure 1.2: A Restricted Boltzmann Machine


##Deep Belief Networks
Deep Belief Networks (DBNs)[Hinton06] is a greedy layer-wise form network consists of stacked RBMs, and its hierachical architecture is shown in Figure~\ref{DBN}. It is also a graphical model which learns a multi-layer representation of the training examples.



Figure 1.3: A Deep Belief Network


As a stacking autoencoder, an DBN is stacked with RBMs by greedy layer wise unsupervised feature learning[Bengio07] , and each layer is a building block trained by RBM. After the hidden layer $\pmb{h^{i}}$ is trained, in next train step it will be as the visible layer, and the successive layer $h^{i + 1}$ will be the hidden layer. The joint distribution between input $\pmb{x}$ and all hidden layers $\pmb{h^{i}} (i\in [1,n])$ can be expressed in Equation~\ref{joint} below. $n$ is the number of RBM layrs. $x$ is the first layer so it can also be viewed as $\pmb{h^{0}}$. $P(\pmb{h^{n-1},h^{n}})$ is the joint probability between visible layer $\pmb{h^{n-1}}$ and hidden layer $\pmb{h^{n}}$ which is on the top of the DBN. $P(\pmb{h^{i-1}|h^i})$ is the distribution for the visible layer $\pmb{h^{i-1}}$ conditioned on layer $\pmb{h^i}$ at level $i$.

$$P(\pmb{x, h^1,..,h^n}) = P(\pmb{h^{n-1}, h^n})\left(\prod_{i=1}^{n-1}P(\pmb{h^{i-1}|h^{i}})\right)$$
The training process of DBN is shown in Algorithm:



For the fine-tuning part in this project, a logistic regression classifier was extended to the last layer. A training label will be assigned to each input and the parameters of the DBN will be tuned by gradient descent algorithm based on the negative log likelihood cost function.

[Hinton06] Geoffrey E Hinton, Simon Osindero, and Yee-Whye Teh. A fast learning algo- rithm for deep belief nets. Neural computation, 18(7):1527–1554, 2006.

[Mohamed11] Abdel-rahman Mohamed, Tara N Sainath, George Dahl, Bhuvana Ramabhad- ran, Geo↵rey E Hinton, Michael Picheny, et al. Deep belief networks using discriminative features for phone recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on, pages 5060– 5063. IEEE, 2011.

[Salakhutdinov07] Ruslan Salakhutdinov, Andriy Mnih, and Geo↵rey Hinton. Restricted boltz- mann machines for collaborative filtering. In Proceedings of the 24th interna- tional conference on Machine learning, pages 791–798. ACM, 2007.

[Hinton10] Geoffrey Hinton. A practical guide to training restricted boltzmann machines. Momentum, 9(1):926, 2010.

[Bengio07] Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle, et al. Greedy layer-wise training of deep networks. Advances in neural information processing systems, 19:153, 2007.

Rotate String

Given a string and an offset, rotate string by offset. (rotate from left to right)

Example
Given “abcdefg”.

offset=0 => "abcdefg"
offset=1 => "gabcdef"
offset=2 => "fgabcde"
offset=3 => "efgabcd"

Solution:

For the example offset=3 => “efgabcd”.
First reverse “abcd” in “abcdefg”, we can get “dcbaefg”.
Then reverse “efg” in “dcbaefg”, we can get “dcbagfe”.
Finally, reverse the whole string “dcbagfe”, we can get “efgabcd”.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
public class Solution {
/**
* @param str: an array of char
* @param offset: an integer
* @return: nothing
*/

public void rotateString(char[] str, int offset) {
// write your code here
if (str == null || str.length == 0)
return;

offset = offset % str.length; //Important row

reverse(str, 0, str.length - 1 - offset);
reverse(str, str.length - offset, str.length - 1);
reverse(str, 0, str.length - 1);
}

public void reverse(char[] str, int start, int end){
for (int i = start, j = end; i < j; i++, j--){
char temp = str[i];
str[i] = str[j];
str[j] = temp;
}
}
}

Ugly Number II

Ugly number is a number that only have factors 2, 3 and 5.

Design an algorithm to find the nth ugly number. The first 10 ugly numbers are 1, 2, 3, 4, 5, 6, 8, 9, 10, 12…

Solution:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
class Solution {
/**
* @param k: The number k.
* @return: The kth prime number as description.
*/

//http://blog.welkinlan.com/2015/07/28/ugly-number-lintcode-java/
public long nthUglyNumber(int k) {
// write your code here
if (k == 1)
return 1;

long unglyNumber = 0;

PriorityQueue<Long> queue = new PriorityQueue<>();
queue.offer((long)2);
queue.offer((long)3);
queue.offer((long)5);

for (int i = 1; i < k; i++){
unglyNumber = queue.poll();

if (unglyNumber % 2 == 0){
queue.offer(unglyNumber * 2);
} else if (unglyNumber % 3 == 0) {
queue.offer(unglyNumber * 2);
queue.offer(unglyNumber * 3);
} else {
queue.offer(unglyNumber * 2);
queue.offer(unglyNumber * 3);
queue.offer(unglyNumber * 5);
}
}

return unglyNumber;
}
};

Digit Count

Count the number of k’s between 0 and n. k can be 0 - 9.\

Example
if n=12, k=1 in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], we have FIVE 1’s (1, 10, 11, 12)

Solution:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
class Solution {
/*
* param k : As description.
* param n : As description.
* return: An integer denote the count of digit k in 1..n
*/

public int digitCounts(int k, int n) {
// write your code here
int count = 0;

for (int i = 0; i <= n; i++){
count += countK(i,k);
}

return count;
}

private int countK(int i, int k){
if (i == 0 && k == 0)
return 1;

int count = 0;

while (i > 0){
count += i % 10 == k ? 1 : 0;
i /= 10;
}

return count;
}
};

Good Courses

Operating System:
CS 140: Operating Systems (Standford Winter 2016)
CS240: Advanced Topics in Operating Systems (Standford)
Operating Systems (211) (Imperial College)

Machine Learning:
Course 493: Intelligent Data Analysis and Probabilistic Inference (Imperial College)
Machine Learning (course 395) (Imperial College)
Advanced Statistical Machine learning 495
CS229 Machine Learning
CS231n: Convolutional Neural Networks for Visual Recognition

Distributed System
Scalable Distributed Systems Design

Network
Networks and Communications

Java 8 Lambda Expressions

Your First Lambda Expression:

There is a general usage in Swing library: we need a action listener to response a user operation.
Before Java 8, we have to create a new inner class which implements ActionListener interface and overide its unique method actionPerformed like the following code:

1
2
3
4
5
button.addActionListener(new ActionListener() {
public void actionPerformed(ActionEvent event) {
System.out.println("button clicked");
}
});

There are two problem:

1. So many template codes
2. The codes are not easy to read without expressing the purpose of the programmers

With Java 8, we can create a lambda expression:

1
button.addActionListener(event -> System.out.println("button clicked"));

Instead of passing in an object that implements an interface, we are passing in a bock of code : a function without a name.
In this code, we do not need to explicitly declare the type of event. In Lambda Expression, java compiler can infer the type of variable event from the context.

Note: For the sake of readability and familiarity, you have the option to include the type declarations, and sometimes the compiler
just can’t work it out!

How to Spot a Lambda in a Haystack

The following code are five forms of lambda expressions as an example:

1
2
3
4
5
6
7
8
Runnable noArguments = () -> System.out.println("Hello World");
ActionListener oneArgument = event -> System.out.println("button clicked");
Runnable multiStatement = () -> {
System.out.print("Hello");
System.out.println(" World");
};
BinaryOperator<Long> add = (x, y) -> x + y;
BinaryOperator<Long> addExplicit = (Long x, Long y) -> x + y;

Using Values, Not Varialbes

When you want to pass a varible outside the anonymous method into the method, you should declared the variable as final. Making a variable final means that you can’t reassign to that variable.

1
2
3
4
5
6
final String name = getUserName();
button.addActionListener(new ActionListener() {
public void actionPerformed(ActionEvent event) {
System.out.println("hi " + name);
}
});

For lambda expression, although you do not need to set the outside variable as final, but you can still not change the value. If you changed, there will be an complie error.

1
2
3
4
5
6
7
interface Formula {
double calculate(int a);

default double sqrt(int a) {
return Math.sqrt(a);
}
}

Default Methods for Interfaces

Java 8 enables us to add non-abstract method implementations to interfaces by utilizing the default keyword. This feature is also known as Extension Methods. Here is our first example:

1
2
3
4
5
6
7
interface Formula {
double calculate(int a);

default double sqrt(int a) {
return Math.sqrt(a);
}
}

Functional Interfaces

A functional interface is an interface with a single abstract method that is used as the type of a lambda expression.

How does lambda expressions fit into Javas type system? Each lambda corresponds to a given type, specifibd by an interface. A so called functional interface must contain exactly one abstract method declaration. Each lambda expression of that type will be matched to this abstract method. Since default methods are not abstract you’re free to add default methods to your functional interface.

To ensure that your interface meet the requirements, you should add the @FunctionalInterface annotation. The compiler is aware of this annotation and throws a compiler error as soon as you try to add a second abstract method declaration to the interface.

Example:

1
2
3
4
@FunctionalInterface
interface Converter<F, T> {
T convert(F from);
}

Keep in mind that the code is also valid if the @FunctionalInterface annotation would be ommited.

Interface name Arguments Returns Example
Predicate T boolean Has this album been released yet?
Consumer T void Printing out a value
Function T R Get the name from an Artist object
Supplier None T A factory method
UnaryOperator T T Logical not (!)
BinaryOperator (T, T) T Multiplying two numbers (*)

Java 8 new properties

Java 8 Lambdas: Functional Programming For The Masses by Richard Warburton gives a short, concise and surprisingly good coverage of how Java 8 lambdas (or “closures”) work and intended to be used, with focus on simplified and safer parallelism. In later context, also gives nice overview of emerging “reactive programming” and event-driven frameworks like Vert.x.

This blog summerizes some notes about the new properties in Java 8. It is a reading note of the book ‘Java 8 Lambdas: Functional Programming for Masses’. The content links are shown below.

1. [Lambda Expressions](./)
2. [Stream]()
3. [Libraries]()
4. [Advanced Collector]()
5. [Data Parallelism]()
6. [Test, Debug and Refactor]()
7. [The Principle of Design and Architect]()

Longest Substring Without Repeating Characters

Given a string, find the length of the longest substring without repeating characters. For example, the longest substring without repeating letters for “abcabcbb” is “abc”, which the length is 3. For “bbbbb” the longest substring is “b”, with the length of 1.

Leetcode link

``

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
public class Solution {
public int lengthOfLongestSubstring(String s) {
int[] map = new int[256]; //Map the character to the first occurence index
int max = 0;
int slow = 0;

Arrays.fill(map,-1);

for (int fast = 0; fast < s.length(); fast++){
char current = s.charAt(fast);

if (slow <= map[current]){
max = Math.max(max, fast - slow);
slow = map[current] + 1;
}

map[current] = fast;
}

max = Math.max(max, s.length() - slow);

return max;
}
}

Markdown Tutorial

When $a \ne 0$, there are two solutions to (ax^2 + bx + c = 0) and they are
$$x = {-b \pm \sqrt{b^2-4ac} \over 2a}.$$

Welcome to md

1
2
Welcome to md
============



1
![md logo](http://psyyz10.github.io/projects/BC.png "Battle City!")

This is a overview of how and what you can do with the markdown language.

After every blank line there will be a paragraph

Another paragraph.

1
2
3
4
5
This is a overview of how and what you can do with the markdown language.

After every blank line there will be a paragraph

Another paragraph.

Italic, bold, and monospace.

1
*Italic*, **bold**, and `monospace`.

Basic lists:

  • first item
  • second item
1
2
3
4
5
Basic lists:

* first item
* second item
* ...

This is a quote block
and is amazing.

A new line can be created with a empty line like the one above

1
2
3
4
> This is a quote block
> and is amazing.
>
> A new line can be created with a empty line like the one above

Unicode is amazing. ☺😄😎😙

1
Unicode is amazing. ☺😄😎😙

This is a subtitle

1
2
This is a subtitle
------------

Let’s create a ordered list:

  1. first item
  2. second item
  3. third item
1
2
3
4
5
6
Let's create a ordered list:

1. first item
2. second item
3. third item
4. ...

This is a slighter smaller title

1
### This is a slighter smaller title

Nested lists? Of course!

  1. Some recipe ingredients:

    • carrots
    • celery
    • lentils
1
2
3
4
5
6
7
Nested lists? Of course!

1. Some recipe ingredients:

* carrots
* celery
* lentils

You can always link stuff like google.com, or a section heading in the current doc.

1
2
You can always link stuff like [google.com](http://google.com), or a [section heading in the current
doc](#this-is-a-subtitle).

Let’s check tables:

size material color
9 leather brown
10 hemp canvas natural
11 glass transparent
1
2
3
4
5
| size | material  | color 
|------|-----------|-----
|9 |leather |brown
|10 |hemp canvas|natural
|11 |glass |transparent

A horizontal rule


1
2
3
A horizontal rule

***

And note that you can backslash-escape any punctuation characters
which you wish to be displayed literally, ex.: `foo`, *bar*, etc.

1
2
And note that you can backslash-escape any punctuation characters
which you wish to be displayed literally, ex.: \`foo\`, \*bar\*, etc.

Latex plugin:
mathjax

$$ a_1 + a_2 = a_3 + a_4$$
$a_1$

1
2
$$ a_1 + a_2 = a_3 + a_4$$
$a_1$

Mac Configurationd

When we use a new Mac OS, we need to do the following configuration:

  • Homebrew
    A package manager for OS X

  • NodeJS
    Node.js® is a JavaScript runtime built on Chrome’s V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient. Node.js’ package ecosystem, npm, is the largest ecosystem of open source libraries in the world.

  • Git
    Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

  • Hexo
    Hexo is a fast, simple and powerful blog framework. You write posts in Markdown (or other languages) and Hexo generates static files with a beautiful theme in seconds.

  • Vimrc
    A good running configuartion file for vim environment.

  • ANACONDA
    Anaconda is a completely free Python distribution (including for commercial use and redistribution). It includes more than 300 of the most popular Python packages for science, math, engineering, and data analysis. See the packages included with Anaconda and the Anaconda changelog.

  • MongoDB
    MongoDB 3.2 is a giant leap forward that helps organizations standardize on a single, modern database for their new, mission-critical applications.

  • How to Set $JAVA_HOME environment variable on Mac OS X

  • Installing Apache Maven
    Jenkins Alternote

  • Iterm2-color-schemeso

  • Keycastr
    KeyCastr, an open-source keystroke visualizer

  • MySQL

  • Mac Terminal

CPP Review note

The “Hello World” Program

C version:

1
2
3
4
5
6
#include <stdio.h>
int main(int argc, char* argv[])
{

printf("Hello world!\n");
return 0;
}

C++ version:

1
2
3
4
5
6
#include <cstdio>
int main(int argc, char* argv[])
{

printf("Hello world!\n");
return 0;
}

Sizes of types

  • The size of types (in bits/bytes) can vary in C/C++
    • For different compilers/operating systems
    • In Java, sizes are standardised, across O/Ss
  • An int changes size more than other types!
    – Used for speed (not portability), but VERY popular! (fast) – Uses the most efficient size for the platform
    – 16 bit operating systems usually use 16 bit int
    – 32 bit operating systems usually use 32 bit int
    – 64 bit operating systems usually use 64 bit int
  • sizeof() operator exists to tell us the size

Chinese interview questions about technical basic knowledge

Offer 选择:

Java 基础:

  1. equals 与 == 的区别
  2. HashMap, HashTable, ConcurrentHashMap 三者的区别
  3. TreeMap, HashMap, LinkedHashMap 的区别
  4. Collection 包结构与 Collections 的区别
    • java.util.Collection 是一个集合接口。它提供了对集合对象进行基本操作的通用接口方法。Collection接口在Java 类库中有很多具体的实现。Collection接口的意义是为各种具体的集合提供了最大化的统一操作方式。
    • java.util.Collections 是一个包装类。它包含有各种有关集合操作的静态多态方法。此类不能实例化,就像一个工具类,服务于Java的Collection框架。
  5. Java 面向对象的三个特征及其含义
  6. 解释和比较 override and overload
  7. Interface 和 absctract class 的区别
  8. java 多态的实现原理
  9. 锁的等级: 方法锁, 对象锁, 类锁
  10. wait()和 sleep() 的区别
  11. java 反射 (reflection) 的作用和原理
    *
  12. Concurrent 包里
  13. for each 和 for 循环的效率比较
  14. 设计模式:单例, 工厂, 适配器, 责任链, 观察者等
  15. ThreadPool 的用法与优势
  16. ThreadLocal 的设计理念与作用
  17. hashcode and equals
  18. 什么情况下一个object不会被回收
  19. 写一个服务器log程序
  20. 内存溢出(out of memory)和内存泄漏(memory leak):
  21. 找公共父节点
  22. Java Object 自带方法
  23. String Buffer 和 String Builder 的区别

多线程

  1. Synchronize 的机制
  2. volatile 关键字
  3. Synchronized 适用情况
  4. 写出单例线程安全的懒汉模式
  5. 写出生产者消费者模型
  6. 解决死锁

Linux 常用指令

  1. 数据库

  2. SQL 索引
    MySQL索引原理及慢查询优化 MySQL索引类型总结和使用技巧以及注意事项
    理解MySQL——索引与优化 MySQL引擎
    *mysql索引总结—-mysql 索引类型以及创建
  3. mySQL事务关系, 数据库事务

测试

  1. Python

  2. python 面试题

Java 和 C++的区别

Java 1.8 新特性

  1. JVM

  2. 内存溢出(out of memory)和内存泄漏(memory leak):
  3. 类加载器:
  4. JVM调优

clone 方法的实现,
浏览器敲回车会发生的过程

##

sort

  1. Data Structure and algorithm

  2. Priority Queue
  3. TreeMap and TreeSet
  4. 海量数据处理问题

JavaEE

  1. Apache Maven 入门篇 ( 上 )

操作系统

  1. 线程与进程的区别
  2. 进程之间的通讯方式
  3. 函数调用过程栈帧变化
  4. 线程之间访问共享资源

网络协议:

  1. 三次握手
    参考
  2. 从客户端发送信息到服务器端的过程模拟
  3. get, put, post 的区别
    GET,POST,PUT,DELETE的区别
  4. Cookie 和 Session
    Cookie/Session机制详解

LRU Cache

Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations: get and set.

get(key) - Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
set(key, value) - Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.

Leetcode link
Lintcode likn

Solution:

To increase the efficiency, use a hash map to record each key to the corresponding node.
Use a double linked the list to store the key value pair, because it is quick for inserting and deleting nodes.
For each get operation, move node which contains the key to the head of the list.
For each set operation, if it is a new key, and the size will be greater than the capacity,
then delete the last node, and insert the new node to the head.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
/*************************************************************************
> File Name: LRUCache.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Fri 20 Nov 13:34:31 2015
************************************************************************/


public class LRUCache{
private class ListNode{
int key;
int value;
ListNode prev;
ListNode next;

ListNode(int key, int value){
this.key = key;
this.value = value;
this.prev = prev;
this.next = next;
}
}

HashMap<Integer, ListNode> map = new HashMap<Integer, ListNode>();
ListNode head;
ListNode tail;
int capacity;

public LRUCache(int capacity) {
this.capacity = capacity;
head = new ListNode(-1,-1);
tail = new ListNode(-1,-1);
head.next = tail;
tail.prev = head;
}

public int get(int key) {
if (!map.containsKey(key))
return -1;

ListNode node = map.get(key);
node.prev.next = node.next;
node.next.prev = node.prev;
moveToHead(node);

return node.value;
}

public void set(int key, int value) {
ListNode node = null;

if (get(key) != -1){ // Note this, use get key to update the node order
map.get(key).value = value;
return;
}

if (map.size() == capacity){
map.remove(tail.prev.key); //
tail.prev.prev.next = tail;
tail.prev = tail.prev.prev;
}


node = new ListNode(key,value);
map.put(key, node);


moveToHead(node);
}

public void moveToHead(ListNode node){
node.next = head.next;
node.next.prev = node;
node.prev = head;
head.next = node;
}
}

Reorder List

Given a singly linked list L: L0→L1→…→Ln-1→Ln,
reorder it to: L0→Ln→L1→Ln-1→L2→Ln-2→…

You must do this in-place without altering the nodes’ values.

For example,
Given {1,2,3,4}, reorder it to {1,4,2,3}.

Leetcode link
Lintcode link

Solution:

First, we can find the middle point of the list.
Then reverse the list behind the middle point.
Finally merge them together.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
/*************************************************************************
> File Name: ReorderList.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Fri 20 Nov 13:21:59 2015
************************************************************************/


public class ReorderList{
public void reorderList(ListNode head) {
if (head == null)
return;

ListNode mid = findMiddle(head);
ListNode tail = reverse(mid.next);
mid.next = null;

merge(head, tail);
}

public void merge(ListNode node1, ListNode node2){
ListNode dummy = new ListNode(-1);
while (node1 != null && node2 != null){
dummy.next = node1;
node1 = node1.next;
dummy.next.next = node2;
node2 = node2.next;
dummy = dummy.next.next;
}

if (node1 != null)
dummy.next = node1;
else if (node2 != null)
dummy.next = node2;
}

public ListNode reverse(ListNode head){
ListNode prev = null;

while (head != null){
ListNode temp = head.next;
head.next = prev;
prev = head;
head = temp;
}

return prev;
}

public ListNode findMiddle(ListNode head){
ListNode fast = head;
ListNode slow = head;

while (fast != null && fast.next != null){
fast = fast.next.next;
slow = slow.next;
}

return slow;
}
}

Linked List Cycle II

Given a linked list, return the node where the cycle begins. If there is no cycle, return null.

Note: Do not modify the linked list.

Follow up:
Can you solve it without using extra space?

Leetcode link
Lintcode link

Solution:

We can use two pointers to go through the list.
The fast pointer has the twice speed as the slow pointer.
If their is a circle, they will meet eventrually.
Assume the length of the head to the loop start point is k, and the loop size is n.
When the slow pointer get the beginning loop node, it has gone k nodes, and the fast pointer go 2k nodes.
The distance between the two pointers is k % n, in other sise is (n - k % n).
When the fisrt meet on the circle, they will meet at the point which is (n - k % n) nodes far from the beginning loop point.
Then let the slow pointer go back to the head, and they start to move with the same speed.
When the slow pointer goes k nodes distance, it get to the beginning loop point.
The fast pointer goes further (k % n) distance, which is (n - k % n + k % n) = n, so it is also the beginning loop point, and also they meet up.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
/*************************************************************************
> File Name: LinkedListCycleII.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Fri 20 Nov 11:52:45 2015
************************************************************************/


public class LinkedListCycleII{
public ListNode detectCycle(ListNode head) {
if (head == null)
return null;

ListNode fast = head;
ListNode slow = head;

while (fast != null && fast.next != null){
fast = fast.next.next;
slow = slow.next;

if (fast == slow)
break;
}

if (fast == null || fast.next == null)
return null;

slow = head;

while (fast != slow){
fast = fast.next;
slow = slow.next;
}

return fast;
}
}

Linked List Cycle

Given a linked list, determine if it has a cycle in it.
Example
Given -21->10->4->5, tail connects to node index 1, return true

Challenge
Follow up:
Can you solve it without using extra space?

Solution:

This is an easy problem. Just use two pointers to point to head.
One pointer go twice speed of the second pointer.
If they meet before the fast touch the end of the list, return true;
Else return false;

Leetcode link
Lintcode link

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
/*************************************************************************
> File Name: LinkedListCycle.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Fri 20 Nov 11:42:04 2015
************************************************************************/


public class LinkedListCycle{
public boolean hasCycle(ListNode head) {
if (head == null)
return false;

ListNode fast = head;
ListNode slow = head;

while (fast != null && fast.next != null){
fast = fast.next.next;
slow = slow.next;

if (slow == fast)
return true;
}

return false;
}
}

Copy List with Random Pointer

A linked list is given such that each node contains an additional random pointer which could point to any node in the list or null.

Return a deep copy of the list.

Solution:

Use a hash to record each node in the origin list as a key
The value is the deep copy of corresponding node.
Go through the list, for each node and random reference, check the map,
if the key was included in the map, get the value, or create a new node with the same label, and put it into the map.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
/*************************************************************************
> File Name: CopyListwithRandomPointer.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Thu 19 Nov 23:02:20 2015
************************************************************************/


public class CopyListwithRandomPointer{
public RandomListNode copyRandomList(RandomListNode head) {
if (head == null)
return null;

HashMap<RandomListNode, RandomListNode> map = new HashMap<RandomListNode, RandomListNode>();
RandomListNode dummy = new RandomListNode(-1);
RandomListNode prev = dummy, newNode;


while (head != null){
if (map.containsKey(head))
newNode = map.get(head);
else{
newNode = new RandomListNode(head.label);
map.put(head,newNode);
}
prev.next = newNode;

if (head.random != null){
if (map.containsKey(head.random))
newNode.random = map.get(head.random);
else{
newNode.random = new RandomListNode(head.random.label);
map.put(head.random, newNode.random);
}

}

prev = prev.next;
head = head.next;
}

return dummy.next;
}
}

Reverse Nodes in k-Group

Given a linked list, reverse the nodes of a linked list k at a time and return its modified list.

If the number of nodes is not a multiple of k then left-out nodes in the end should remain as it is.

You may not alter the values in the nodes, only nodes itself may be changed. Only constant memory is allowed.

Example
Given this linked list: 1->2->3->4->5

For k = 2, you should return: 2->1->4->3->5

For k = 3, you should return: 3->2->1->4->5

Leetcode link
Lintcode link

Solution:

Go through the list to obtain the length of the list.
Reverse the successive k nodes if the lenght of the right side is greater than k.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
/*************************************************************************
> File Name: ReverseNodesinK-Group.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Wed 18 Nov 23:51:11 2015
************************************************************************/


public class ReverseNodesinK-Group{
public ListNode reverseKGroup(ListNode head, int k) {
if (k == 0 || k == 1)
return head;
ListNode dummy = new ListNode(-1);
dummy.next = head;

int length = 0;
while (head != null){
length++;
head = head.next;
}

head = dummy;

while (length >= k){
head = reverse(head, k);
length -= k;
}

return dummy.next;
}

public ListNode reverse(ListNode node, int k){
ListNode head = node.next;
ListNode startNode = head;
ListNode prev = null;

for (int i = 0; i < k ; i++){
ListNode temp = head.next;
head.next = prev;
prev = head;
head = temp;
}
startNode.next = head;
node.next = prev;

return startNode;
}
}

Swap Nodes in Pairs

Given a linked list, swap every two adjacent nodes and return its head.

Example
Given 1->2->3->4, you should return the list as 2->1->4->3.

Challenge
Your algorithm should use only constant space. You may not modify the values in the list, only nodes itself can be changed.

Solution:

This is an easay question.
If there are more than two nodes behind, exchange the order of the following two nodes,
and then move the http://www.lintcode.com/en/problem/swap-nodes-in-pairs/pointer 2 places.

Leetcode link
Lintcode link

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
/*************************************************************************
> File Name: SwapNodesinPairs.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Wed 18 Nov 23:21:09 2015
************************************************************************/


public class SwapNodesinPairs{
public ListNode swapPairs(ListNode head) {
// Write your code here
ListNode dummy = new ListNode(-1);
dummy.next = head;
head = dummy;

while (head.next != null && head.next.next != null){
ListNode temp = head.next;
head.next = temp.next;
temp.next = head.next.next;
head.next.next = temp;
head = head.next.next;
}

return dummy.next;
}
}

Rotate List

Given a list, rotate the list to the right by k places, where k is non-negative.

Have you met this question in a real interview? Yes

Example
Given 1->2->3->4->5 and k = 2, return 4->5->1->2->3.

Leetcode
Lintcode

Solution:

Because k may be greater the length of the list, we need to caculate the length of the list first.
Then we can say rotate the list by (k % length) places.
Next, we Link the head and tail of the list as a circle.
Continuely, go through (length - k % length) places, and break the connection at that node.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
/*************************************************************************
> File Name: RotateList.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Wed 18 Nov 16:47:13 2015
************************************************************************/


public class RotateList{
public ListNode rotateRight(ListNode head, int k) {
if (head == null)
return head;

int length = 1;

ListNode current = head;
while (current.next != null){
length++;
current = current.next;
}

current.next = head;
k = length - k % length ;

while (k > 0){
current = current.next;
k--;
}

head = current.next;
current.next = null;

return head;
}
}

Remove Duplicates from Sorted List II

Given a sorted linked list, delete all nodes that have duplicate numbers, leaving only distinct numbers from the original list.

Have you met this question in a real interview? Yes
Example
Given 1->2->3->3->4->4->5, return 1->2->5.
Given 1->1->1->2->3, return 2->3.

Leetcode link
Lintcode link

Solution:

Go through the list, use a variable ‘prevPre’ as a ponter to point to the node before previous node.
If the middle two nodes have the same value, delete the two nodes and record the value. Continue on the list, if the following nodes have the same value as this value, delete the following nodes.
Iterate the previous steps untils there are no node behind.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
/*************************************************************************
> File Name: RemoveDuplicatesfromSortedListII.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Wed 18 Nov 15:27:20 2015
************************************************************************/


public class RemoveDuplicatesfromSortedListII{
public ListNode deleteDuplicates(ListNode head) {
if (head == null || head.next == null)
return head;

ListNode dummy = new ListNode(-1);
dummy.next = head;
ListNode prevPre = dummy;

while (prevPre.next != null && prevPre.next.next != null){
if (prevPre.next.val == prevPre.next.next.val){
int value = prevPre.next.val;
prevPre.next = prevPre.next.next.next;

while (prevPre.next != null && prevPre.next.val == value){
prevPre.next = prevPre.next.next;
}
}
else{
prevPre = prevPre.next;
}
}

return dummy.next;
}
}

String to Integer(atoi)

Implement function atoi to convert a string to an integer.

If no valid conversion could be performed, a zero value is returned.

If the correct value is out of the range of representable values, INT_MAX (2147483647) or INT_MIN (-2147483648) is returned.

Example
"10" => 10

"-1" => -1

"123123123123123" => 2147483647

"1.0" => 1

Leetcode
Lintcode

Solution:

Detailed problem.
Three derails should be noted:

1. valid char c : c - '0',
2. not valid char: stop
3. '+' or '-' sign at the start, record it
4. Overflow: INT_MAX (2147483647) or INT_MIN (-2147483648) is returned
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
public class Solution {
public int atoi(String str) {
// write your code here
if (str == null || str.length() == 0)
return 0;
int sign = 1;
int index = 0;

// if we use long, we should return MAX or MIN value directly,
double num = 0;


str = str.trim();

if (str.charAt(index) == '+')
index++;
else if (str.charAt(index) == '-'){
sign = -1;
index++;
}

for (; index < str.length(); index++){
char current = str.charAt(index);
if (current > '9' || current < '0' || num > Integer.MAX_VALUE)
break;
num = num * 10 + (current - '0');
}

return (int) (sign * num);
}


// If double and long are not allowed
public int atoi(String str) {
// write your code here
if (str == null || str.length() == 0)
return 0;
int sign = 1;
int index = 0;
int num = 0;

str = str.trim();

if (str.charAt(index) == '+')
index++;
else if (str.charAt(index) == '-'){
sign = -1;
index++;
}

for (; index < str.length(); index++){
char current = str.charAt(index);
if (current > '9' || current < '0')
break;
if (num > Integer.MAX_VALUE / 10
|| (num == Integer.MAX_VALUE / 10 && num > Integer.MAX_VALUE % 10))
return sign > 0? Integer.MAX_VALUE : Integer.MIN_VALUE;
num = num * 10 + (current - '0');
}

return (int) (sign * num);
}
}

Remove Duplicates from Sorted List

Problem:

Given a sorted linked list, delete all duplicates such that each element appear only once.

Example
Given 1->1->2, return 1->2.
Given 1->1->2->3->3, return 1->2->3.

Leetcode link
Lintcode link

Solution:

This is an eassy proble, just go through the list, and remove the node whose value equals the value of previous node.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
/*************************************************************************
> File Name: RemoveDuplicatesfromSortedList.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Mon 16 Nov 17:23:00 2015
************************************************************************/


public class RemoveDuplicatesfromSortedList{
public ListNode deleteDuplicates(ListNode head) {
if (head == null) return null;

ListNode prev = head;

while (prev.next != null){
if (prev.next.val != prev.val){
prev = prev.next;
} else {
prev.next = prev.next.next;
}
}

return head;
}
}

Partition List

Problem:

Given a linked list and a value x, partition it such that all nodes less than x come before nodes greater than or equal to x.

You should preserve the original relative order of the nodes in each of the two partitions.

For example
Given 1->4->3->2->5->2 and x = 3,
return 1->2->2->4->3->5.

Leetcode link
Lintcode link

Solution:

It is an easy problem.
Just store the partition the linkedlist into to sub-linked list.
Finally, join them together.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
/*************************************************************************
> File Name: PartitionList.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Mon 16 Nov 16:55:48 2015
************************************************************************/


public class PartitionList{
public ListNode partition(ListNode head, int x) {
ListNode current = head;
ListNode leftDummy = new ListNode(-1);
ListNode leftEnd = leftDummy;
ListNode rightDummy = new ListNode(-1);
ListNode rightEnd = rightDummy;

while (current != null){
if (current.val < x){
leftEnd.next = current;
leftEnd = leftEnd.next;
} else {
rightEnd.next = current;
rightEnd = rightEnd.next;
}
current = current.next;
}

leftEnd.next = rightDummy.next;
rightEnd.next = null;

return leftDummy.next;
}
}
}

Reverse Linked List II

Problem:s

Reverse a linked list from position m to n.

*Example*
Given 1->2->3->4->5->NULL, m = 2 and n = 4, return 1->4->3->2->5->NULL.

Note
Given m, n satisfy the following condition: 1 ≤ m ≤ n ≤ length of list.

Challenge
Reverse it in-place and in one-pass

Leetcode link
Lintcode link

Solution:

This problem is complex which has many edge cases so that it is not easy to write a bug free program within a short time.
The idea of the solution is to go through the list, to the m, store the previous node before m, and start reversing from m to n.
Then link the non-reversed and reversed part together.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
/*************************************************************************
> File Name: ReverseLinkedListII.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Mon 16 Nov 14:48:56 2015
************************************************************************/


public class ReverseLinkedListII{
// Solution 1
public static ListNode reverseBetween(ListNode head, int m, int n) {
if (m>=n) return head;

ListNode current = head;
int i;
ListNode startNode = null, newNode = null, previousNode = null;
for (i = 1; i <= n; i++){
if (i == m - 1){
previousNode = current;
}
if (i == m){
startNode = newNode = current;
}

if (i>m){
ListNode temp = current.next;
current.next = newNode;
newNode = current;
current = temp;
continue;
}

current = current.next;
}

startNode.next = current;

if (previousNode == null) head = newNode;
else previousNode.next = newNode;

return head;
}

// Solution 2
public ListNode reverseBetween(ListNode head, int m, int n) {
if (m >= n || head == null)
return head;

ListNode dummy = new ListNode(-1);
dummy.next = head;
ListNode prev = dummy;

for (int i = 0; i < m - 1 ; i++){
if (prev == null) {
return dummy.next;
}
prev = prev.next;
}

ListNode mPrev = prev;
ListNode current = prev.next;
ListNode last = new ListNode(current.val);
ListNode first = last;
current = current.next;

for (int i = 0; i < n - m ; i++){
if (current == null) {
return dummy.next;
}
ListNode temp = current.next;
current.next = first;
first = current;
current = temp;
}

mPrev.next = first;

last.next = current;

return dummy.next;
}
}

Add Two Numbers II

Problem:

You have two numbers represented by a linked list, where each node contains a single digit. The digits are stored in forward order, such that the 1’s digit is at the head of the list. Write a function that adds the two numbers and returns the sum as a linked list.

Lintcode link

Example
Given 6->1->7 + 2->9->5. That is, 617 + 295.

Return 9->1->2. That is, 912.

Solution:

Revese the given two linked list.
Then use the solution presented in the solution of “Add Two Numbers” to add then together.
Finally, reverse the result linked list back.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
/*************************************************************************
> File Name: AddTwoNumbersII.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Mon 16 Nov 14:27:38 2015
************************************************************************/


public class AddTwoNumbersII{
/**
* @param l1: the first list
* @param l2: the second list
* @return: the sum list of l1 and l2
*/

public ListNode addLists2(ListNode l1, ListNode l2) {
l1 = reverse(l1);
l2 = reverse(l2);

return reverse(addLists1(l1,l2));
}

public ListNode reverse(ListNode l){
ListNode head = null;

while (l != null){
ListNode temp = l.next;
l.next = head;
head = l;
l = temp;
}

return head;
}

public ListNode addLists1(ListNode l1, ListNode l2){
if (l1 == null && l2 == null)
return null;

ListNode head = new ListNode(0);
ListNode current = head;
int carry = 0;

while (l1 != null || l2 != null){
int sum = carry;

if (l1 != null){
sum += l1.val;
l1 = l1.next;
}

if (l2 != null){
sum += l2.val;
l2 = l2.next;
}

carry = sum / 10;
current.next = new ListNode(sum % 10);
current = current.next;
}

if (carry != 0){
current.next = new ListNode(carry);
}

return head.next;
}
}

Add Two Numbers

Problem:

You have two numbers represented by a linked list, where each node contains a single digit. The digits are stored in reverse order, such that the 1’s digit is at the head of the list. Write a function that adds the two numbers and returns the sum as a linked list.

Have you met this question in a real interview? Yes

Example
Given 7->1->6 + 5->9->2. That is, 617 + 295.

Return 2->1->9. That is 912.

Given 3->1->5 and 5->9->2, return 8->0->8.

Solution:

This is an easay problem. Use a carry varible to store the carray and go through the two list from right to left doing addtion.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
/*************************************************************************
> File Name: AddTwoNumbers.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Mon 16 Nov 12:07:30 2015
************************************************************************/


public class AddTwoNumbers{
public ListNode addTwoNumbers(ListNode l1, ListNode l2) {
if (l1 == null && l2 == null)
return null;

int carry = 0;
ListNode head = new ListNode(0);
ListNode current = head;

while (l1 != null || l2 != null){
int sum = carry;

if (l1 != null){
sum += l1.val;
l1 = l1.next;
}
if (l2 != null){
sum += l2.val;
l2 = l2.next;
}

carry = sum / 10;
current.next = new ListNode(sum % 10);
current = current.next;
}

if (carry != 0){
current.next = new ListNode(carry);
}

return head.next;
}
}

Single Number III

Problem:

Given 2*n + 2 numbers, every numbers occurs twice except two, find them.

Have you met this question in a real interview? Yes

Example
Given [1,2,2,3,4,4,5,3] return 1 and 5

Note:

  1. The order of the result is not important. So in the above example, [5, 1] is also correct.
  2. Your algorithm should run in linear runtime complexity. Could you implement it using only constant space complexity?

Challenge
O(n) time, O(1) extra space.

Leetcode link
Lintcode link

Solution:

This problem is also about bit operation.
First we xor all the numbers in the array.
Then we can get the last digits which equals 1.
The bit represent the last different bit of the two exceptional numbers.
Next, we use the bit to separate the numbers in the array into two group, and use xor to link the numbers in each group.
After the iteration stop, each number left in each group should be the two results.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
/*************************************************************************
> File Name: SingleNumberIII.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Mon 16 Nov 11:04:57 2015
************************************************************************/


public class SingleNumberIII{
public int[] singleNumber(int[] nums){
int[] result = new int[2];
int temp = 0;
for (int i = 0; i < nums.length; i++){
temp ^= nums[i];
}

int lastBit = temp - (temp & (temp - 1));

for (int i = 0; i < nums.length; i++){
if ((nums[i] & lastBit) == 0){
result[0] ^= nums[i] ;
} else {
result[1] ^= nums[i] ;
}
}

return result;
}
}

Single Number II

Problems:

Given 3*n + 1 numbers, every numbers occurs triple times except one, find it.

Example
Given [1,1,2,3,3,3,2,2,4,1] return 4

Challenge
One-pass, constant extra space.

Leetcode link
Lintcode link

Solution:

This problem examines the ability of applying bit operation.
First, we create an integer array with size of 32, each int represent a bit of an integer.
Then for each bit position of an integer, we go through the input numbers, and accululate the value of the correpsonding bit.
As we know, the sum of each bit shold be dividable by 3 except one number.
So we mod 3 for each bit, then the left number is the result.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
/*************************************************************************
> File Name: SingleNumberII.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Sun 15 Nov 23:40:03 2015
************************************************************************/


public class SingleNumberII{
public int singleNumber(int[] nums){
int result = 0;
int[] bits = new int[32];

for (int i = 0; i < bits.length; i++){
for (int j = 0; j < nums.length; j++){
bits[i] += (nums[j] >> i) & 1;
bits[i] %= 3;
}

result |= bits[i] << i;
}

return result;
}
}

Single number

Problem:

Given 2*n + 1 numbers, every numbers occurs twice except one, find it.

Have you met this question in a real interview? Yes
Example
Given [1,2,2,1,3,4,3], return 4

Challenge
One-pass, constant extra space.

Leetcode link
Lintcode link

Solution:

As we know, each two same integer after xor operation will lead to 0x00000000.
And if x is an integer, 0x00000000 ^ x = x.
So we can take the adantage of xor property to solve this problem.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
/*************************************************************************
> File Name: SingleNumber.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Sun 15 Nov 23:21:17 2015
************************************************************************/


public class SingleNumber{
public int singNumber(int[] A){
if (A == null || A.length == 0){
return 0;
}

int result = A[0];

for (int i = 1; i < A.length; i++){
result = result ^ A[i];
}

return result;
}
}

Candy

Problem:

There are N children standing in a line. Each child is assigned a rating value.

You are giving candies to these children subjected to the following requirements:

Each child must have at least one candy.

Children with a higher rating get more candies than their neighbors.

What is the minimum candies you must give?

Have you met this question in a real interview? Yes

Example
Given ratings = [1, 2], return 3.

given ratings = [1, 1, 1], return 3.

Given ratings = [1, 2, 2], return 4. ([1,2,1]).

Leetcode link
Lintcode link

Solution:

Initialize the candy array to 1s.
First go through from left to right;
Second go through form right to left with the following condition:

  • If the current child’s rating is greater than the previous one, the current candies should be at least 1 more than the previous one
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
/*************************************************************************
> File Name: Candy.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Sun 15 Nov 22:03:51 2015
************************************************************************/


public class Candy{
public int candy(int[] ratings) {
if (ratings == null || ratings.length == 0)
return 0;

int length = ratings.length;
int[] candies = new int[length];
Arrays.fill(candies,1);

for (int i = 1; i < length; i++){
if (ratings[i] > ratings[i-1]) {
candies[i] = candies[i-1] + 1;
}
}

int sum = candies[length - 1];
for (int i = length - 2; i >= 0; i--){
if (ratings[i] > ratings[i+1]){
// Note that candies[i] may greater than (candies[i+1] + 1), so we use the max one
candies[i] = Math.max(candies[i], candies[i+1] + 1);
}
sum += candies[i];
}

return sum;
}
}

Gas Station

Problem:

There are N gas stations along a circular route, where the amount of gas at station i is gas[i].

You have a car with an unlimited gas tank and it costs cost[i] of gas to travel from station i to its next station (i+1). You begin the journey with an empty tank at one of the gas stations.

Return the starting gas station’s index if you can travel around the circuit once, otherwise return -1.

Have you met this question in a real interview? Yes
Example
Given 4 gas stations with gas[i]=[1,1,3,1], and the cost[i]=[2,2,1,1]. The starting gas station’s index is 2.

Note
The solution is guaranteed to be unique.

Challenge
O(n) time and O(1) extra space

Solution:

Use a ‘total’ variable to record the total amount of gas left through the circus.
Use a ‘sum’ variable to record the current gas left in the tank.
Use a pointer to record the starting label.
Once the there is no gass left in the car before next gas station available, set the label next gas station as the new starting gas station.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
/*************************************************************************
> File Name: GasStation.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Sun 15 Nov 19:33:24 2015
************************************************************************/


public class GasStation{
public int canCompleteCircuit(int[] gas, int[] cost){
int sum = 0;
int total = 0;
int label = 0;

for (int i = 0; i < gas.length; i++){
total += gas[i];
sum += gas[i];

total -= cost[i];
sum -= cost[i];

if (sum < 0){
sum = 0;
label = i + 1;
}
}

return total >= 0 ? label : -1;
}
}

Set Matrix Zeroes

Problem:

Given a m x n matrix, if an element is 0, set its entire row and column to 0. Do it in place.

Have you met this question in a real interview? Yes
Example
Given a matrix

[
  [1,2],
  [0,3]
],
return [ [0,2], [0,0] ]

Challenge
Did you use extra space? A straight forward solution using O(mn) space is probably a bad idea. A simple improvement uses O(m + n) space, but still not the best solution. Could you devise a constant space solution?

Solution:

The first method used two boolean arrays to store the columns or rows should be set to zero.
To make it use constant space, the second method use the first row and column to store the information that rows and columns should be set to zero.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
/*************************************************************************
> File Name: SetMatrixZeros.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Sun 15 Nov 16:11:35 2015
************************************************************************/


public class SetMatrixZeros{
// Solution 1, time complexity O(m * n), space complexity O(m + n)
public void setZeroes(int[][] matrix) {
int rowLength = matrix.length;
int colLength = matrix[0].length;

boolean[] zeroRows = new boolean[rowLength];
boolean[] zeroCols = new boolean[colLength];

for (int row = 0; row < rowLength; row++){
for (int col = 0; col < colLength; col++){
if (matrix[row][col] == 0){
zeroRows[row] = true;
zeroCols[col] = true;
}
}
}

for (int row = 0; row < zeroRows.length; row++){
if (zeroRows[row]){
for (int col = 0; col< colLength; col++)
matrix[row][col] = 0;
}
}

for (int col = 0; col < zeroCols.length; col++){
if (zeroCols[col]){
for (int row = 0; row < rowLength; row++)
matrix[row][col] = 0;
}
}
}

// Solution 2, time complexity O(m * n), space complexity O(1)
public void setZeroes(int[][] matrix) {
int rowLength = matrix.length;
int colLength = matrix[0].length;

boolean firstRowZero = false;
boolean firstColZero = false;

// Check whether the first row contains zero
for (int col = 0; col < colLength; col ++){
if (matrix[0][col] == 0){
firstRowZero = true;
break;
}
}

// Check whether the first column contains zero
for (int row = 0; row < rowLength; row++){
if (matrix[row][0] == 0){
firstColZero = true;
break;
}
}

// set the element in the first row or column to
// be zero if the corresponding column or row contains zero
for (int row = 1; row < rowLength; row++){
for (int col = 1; col < colLength; col ++){
if (matrix[row][col] == 0){
matrix[0][col] = 0;
matrix[row][0] = 0;
}
}
}

// Set corresponding rows to be zeors;
for (int row = 1; row < rowLength; row++){
if (matrix[row][0] == 0){
for (int col = 1; col < colLength; col ++){
matrix[row][col] = 0;
}
}
}

// Set corresponding cols to be zeros
for (int col = 1; col < colLength; col++){
if (matrix[0][col] == 0){
for (int row = 1; row < rowLength; row ++){
matrix[row][col] = 0;
}
}
}

// Process the first row
if (firstRowZero){
for (int col = 0; col < colLength; col ++){
matrix[0][col] = 0;
}
}

// Process the first column
if (firstColZero){
for (int row = 0; row < rowLength; row++){
matrix[row][0] = 0;
}
}
}
}

Gray Code

Problem:

The gray code is a binary numeral system where two successive values differ in only one bit.

Given a non-negative integer n representing the total number of bits in the code, find the sequence of gray code. A gray code sequence must begin with 0 and with cover all 2n integers.

Have you met this question in a real interview? Yes
Example
Given n = 2, return [0,1,3,2]. Its gray code sequence is:

00 - 0
01 - 1
11 - 3
10 - 2

Note
For a given n, a gray code sequence is not uniquely defined.

[0,2,3,1] is also a valid gray code sequence according to the above definition.

Challenge
O(2n) time.

Leetcode link
Lintcode link

Solution:

Use recursion to generate a new gray code from the (n-1) as the picture shown in the Figure below.




/*************************************************************************
    > File Name: GrayCode.java
    > Author: Yao Zhang 
    > Mail: psyyz10@163.com 
    > Created Time: Fri 13 Nov 17:22:56 2015
 ************************************************************************/

public class GrayCode{
    List<Integer> grayCode(int n){
        ArrayList<Integer> result = new ArrayList<Integer>();
        result.add(0);
        if (n == 0)
            return result;
        if (n == 1){
            result.add(1);
            return result;
        }
        result = (ArrayList<Integer>) grayCode(n - 1);
        ArrayList<Integer> temp = reverse(result);
        for (int i = 0; i < temp.size(); i++){
            temp.set(i,(1 << (n - 1)) + temp.get(i));
        }
        result.addAll(temp);

        return result;
    }

    public ArrayList<Integer> reverse(ArrayList<Integer> result){
        ArrayList<Integer> newResult = new ArrayList<Integer>();
        for (int i = result.size() - 1; i >= 0; i--){
            newResult.add(result.get(i));
        }
        return newResult;
    }
}

Climbing Stairs

Problem:

You are climbing a stair case. It takes n steps to reach to the top.

Each time you can either climb 1 or 2 steps. In how many distinct ways can you climb to the top?

Solution:

The idea behind this problem is the same as Fibonacci which is f(n) = f(n-1) + f(n-2).
There are two solutions making the time complexity as O(N).

  1. The first one is an iteration. Time complexity O(n), space complexity O(1).
  2. The second one is recurstive version. But the normal recurtion is very expensive, roughly O(2^n).
    So we used dynamic programming by take a buffer stores the value has gone.
    The time complexity is O(n), space complexity is O(n) too.

There are two solutions.
/*

> File Name: ClimbingStairs.java
> Author: Yao Zhang 
> Mail: psyyz10@163.com 
> Created Time: Fri 13 Nov 16:12:24 2015

**/

public class ClimbingStairs{
// Solution 1
public int climbStairs(int n){
if (n <= 1)
return n;

   int prev = 1;
   int last = 1;
   int current = 0;
   for (int i = 2; i <= n; i++){
       current = prev + last;
       last = prev;
       prev = current;
   }

   return current;
}

//Solution2
public int climbStairs(int n){
    if (n < 0) 
        return 0;             

    int[] buffer = new int[n];

    return climbStairsHelper(n, buffer);
}              
public int climbStairsHelper(int n, int[] buffer){
    if (n < 0){
        return 0;
    }else if (n == 0){
        return 1;
    } else if (buffer[n - 1] != 0){                                                                                  
        return buffer[n - 1];        
    } else {   
        buffer[n - 1] = climbStairsHelper(n - 1, buffer) + climbStairsHelper(n - 2, buffer);
        return buffer[n - 1];     
    }          
}   

}

Plus One

Problem:

Given a non-negative number represented as an array of digits, plus one to the number.

The digits are stored such that the most significant digit is at the head of the list.

Example
Given [1,2,3] which represents 123, return [1,2,4].

Given [9,9,9] which represents 999, return [1,0,0,0].

Leetcode link
Lintcode link

Solution:

It is a essay problem, just add carry though the list, until the carry equals 0.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
/*************************************************************************
> File Name: PlusOne.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Fri 13 Nov 14:27:19 2015
************************************************************************/


public class PlusOne{
public int[] plusOne(int[] digits){
int carry = 1;
for (int i = digits.length - 1; i >= 0 && carry > 0; i--){
int sum = digits[i] + carry;
carry = sum / 10;
digits[i] = sum % 10;
}

if (carry == 0)
return digits;

int[] newDigits = new int[digits.length + 1];
newDigits[0] = carry;

for (int i = 1; i < newDigits.length; i++){
newDigits[i] = digits[i - 1];
}

return newDigits;
}
}

Rotate Image

Problem:

You are given an n x n 2D matrix representing an image. Rotate the image by 90 degrees (clockwise).

Example
Given a matrix

[
    [1,2],
    [3,4]
]

rotate it by 90 degrees (clockwise), return

[
    [3,1],
    [4,2]
]

Leetcode link
Lintcode link

Solution:

There are two solutions.
The idea of the first method is to divide the matrix into four smaller matrix, then move each sub-matrix to the rotated position
The idea of the second method is to rotate layer by layer, from the outmost layer to the inner most layer.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
/*************************************************************************
> File Name: RotatedImage.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Fri 13 Nov 13:56:37 2015
************************************************************************/


public class RotatedImage{
//Solution 1
public void rotate(int[][] matrix) {
int length = matrix.length;

for (int i = 0; i < length / 2; i++){
for (int j = 0; j < (length + 1) / 2; j++){
int temp = matrix[i][j];
matrix[i][j] = matrix[length - 1 - j][i];
matrix[length - 1 - j][i] = matrix[length - 1 - i][length - 1 - j];
matrix[length - 1 - i][length - 1 - j] = matrix[j][length - 1 - i];
matrix[j][length - 1 - i] = temp;
}
}
}

//Solution 2
public void rotate(int[][] matrix) {
int length = matrix.length;

for (int layer = 0; layer < length / 2; layer++){
int last = length - 1 - layer;
for (int j = layer; j < last; j++){
int offset = j - layer;
int temp = matrix[layer][j];
matrix[layer][j] = matrix[last - offset][layer];
matrix[last - offset][layer] = matrix[last][last - offset];
matrix[last][last - offset] = matrix[j][last];
matrix[j][last] = temp;
}
}
}
}

Trapping Rain Water

Problem:

Given n non-negative integers representing an elevation map where the width of each bar is 1, compute how much water it is able to trap after raining.

For example, 
Given [0,1,0,2,1,0,1,3,2,1,2,1], return 6.


The above elevation map is represented by array [0,1,0,2,1,0,1,3,2,1,2,1]. In this case, 6 units of rain water (blue section) are being trapped. Thanks Marcos for contributing this image!

Leetcode link
Lintcode link

My solution:

For each bar, find the highest bars of the left side and right side respectively.
The trapping water for the bar is min(leftMax,rightmax) - its height.

There are two solutions.

    • Find the highest bar
    • Process the left and right of the highest bar respectively
    • Go through the bars from left to right, calculate the left highest bar for each bar
    • Go through the bars form right to left, calculate the left highest bar for each bar
    • Calculate the min(leftMax,rightmax) - its height for each bar.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
/*************************************************************************
> File Name: TrappingWater.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Fri 13 Nov 10:45:30 2015
************************************************************************/


public class TrappingWater{
// Solution 1
public int trap(int[] height){
int area = 0;
int maxIndex = 0;

// Find the index of the highest bar
for (int i = 0; i < height.length; i ++){
if (height[i] > height[maxIndex])
maxIndex = i;
}

// Process the left side of the highest bar
int leftMax = 0;
for (int i = 0; i < maxIndex; i++){
if (height[i] > leftMax)
leftMax = height[i];
else
area += leftMax - height[i];
}

// Process the rigtht side of the highest bar
int rightMax = 0;
for (int i = height.length - 1; i > maxIndex; i--){
if (height[i] > rightMax)
rightMax = height[i];
else
area += rightMax - height[i];
}

return area;
}

// Solution 2
public int trap2(int[] height){
int area = 0;

if (height.length == 0)
return area;

// find the left highest bar value for each bar
int[] leftMax = new int[height.length];
leftMax[0] = 0;
for (int i = 1; i < height.length; i++){
leftMax[i] = Math.max(height[i-1], leftMax[i-1]) ;
}

// for each bar, calculate the trapping water by
// calculating min(leftMax,rightmax) - height
int max = 0;
for (int i = height.length - 1; i > 0; i--){
int diff = Math.min(max,leftMax[i]) - height[i];
area += diff > 0 ? diff : 0;
max = Math.max(height[i], max);
}

return area;
}
}

Valid Sudoku

Problem:

Determine if a Sudoku is valid, according to: Sudoku Puzzles - The Rules.

The Sudoku board could be partially filled, where empty cells are filled with the character ‘.’.


A partially filled sudoku which is valid.

Note:
A valid Sudoku board (partially filled) is not necessarily solvable. Only the filled cells need to be validated.

Leetcode link
Lintcode link

My Solution:

This question is straight, no tricky ideas needed.
But it reviews the capacity of detailed implementation.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
/*************************************************************************
> File Name: ValidateSudoku.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Thu 12 Nov 22:54:54 2015
************************************************************************/


public class ValidateSudoku{
public boolean isValidSudoku(char[][] board){
boolean[] visited = new boolean[9];

for (int col = 0; col < 9; col++){
Arrays.fill(visited,false);

for (int row = 0; row < 9; row++){
char current = board[col][row];
if (!process(visited,current)){
return false;
}
}
}


for (int row = 0; row < 9; row++){
Arrays.fill(visited,false);

for (int col = 0; col < 9; col++){
char current = board[col][row];
if (!process(visited,current)){
return false;
}
}
}

for (int i = 0; i < 9; i += 3){
for (int j = 0; j < 9; j += 3){
Arrays.fill(visited, false);

for (int k = 0; k < 9; k++){
// note use mod and division to track a small matrix
char current = board[i + k % 3][j + k / 3];
if (!process(visited, current)){
return false;
}
}
}
}

return true;

}

public boolean process(boolean[] visited, char current){
if (current == '.')
return true;

int index = current - '1';
if (visited[index])
return false;

visited[index] = true;
return true;
}
}

Related Problems:

Permutation Sequence

Problem:

Given n and k, return the k-th permutation sequence.

Example
For n = 3, all permutations are listed as follows:

"123"
"132"
"213"
"231"
"312"
"321"
If k = 4, the fourth permutation is "231"

Note
n will be between 1 and 9 inclusive.

Challenge
O(n*k) in time complexity is easy, can you do it in O(n^2) or less?

Leetcode link
Lintcode link

My Solution:

The most intuitive way is calling nextPermuation method for k times. It takes O(k * n);
Assume there are n non-repeated numbers,whose k-th permuation is a1,a2,a3, …, an.
For each increassing of a1, there are (n-1)! possibilities for a2 to an. So a1 = k / (n-1)!;
Similarly we can get the following as following:

k_2 = k % (n-1)!;
a_2 = k_2 / (n-2)!;
   ...
k_{n-1} = k_{n-2} / 2!;
a_{n-1} = k-{n-1} / 1!;
a_n = 1;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
/*************************************************************************
> File Name: PermuationSequence.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Thu 12 Nov 15:14:13 2015
************************************************************************/


public class PermuationSequence{
public String getPermutation(int n, int k) {
StringBuilder builder = new StringBuilder();
int factor = 1;
for (int i = 1; i < n; i++){
factor *= i;
}

boolean[] used = new boolean[n];
k = k - 1; // note k should minus 1
for (int i = 0; i < n; i++){
int index = k / factor;
k = k % factor;

for (int j = 0; j < n; j++){
if (!used[j]){
if (index == 0){
builder.append(String.valueOf(j + 1)) ;
used[j] = true;
break;
} else{
index--;
}
}
}

if (n - 1 != i)
factor /= (n - 1 - i);
}

return builder.toString();
}
}

Next Permutation

Problem:

Implement next permutation, which rearranges numbers into the lexicographically next greater permutation of numbers.

If such arrangement is not possible, it must rearrange it as the lowest possible order (ie, sorted in ascending order).

The replacement must be in-place, do not allocate extra memory.

Here are some examples. Inputs are in the left-hand column and its corresponding outputs are in the right-hand column.
1,2,31,3,2
3,2,11,2,3
1,1,51,5,1

Leetcode link
Lintcode link

Solution:




Image from http://fisherlei.blogspot.com/2012/12/leetcode-next-permutation.html

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
/*************************************************************************
> File Name: NextPermutation.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Thu 12 Nov 12:53:41 2015
************************************************************************/


public class NextPermutation{
public void nextPermutation(int[] nums) {
int partitionIndex = -1;
for (int i = nums.length - 2; i >= 0; i--){
if (nums[i] < nums[i + 1]) {
partitionIndex = i;
break;
}
}

if (partitionIndex != -1){
int changeIndex = 0;
for (int i = nums.length - 1; i >= 0; i--){
if (nums[partitionIndex] < nums[i]){
changeIndex = i;
break;
}
}
swap(nums,partitionIndex,changeIndex);
}

int start = partitionIndex + 1;
int end = nums.length - 1;

reverse(nums, start, end);
}

public void reverse(int[] array, int start, int end){
for (int i = start, j = end; i < j; i++, j--)
swap(array, i, j);
}

public void swap(int[] array, int index1, int index2){
int temp = array[index1];
array[index1] = array[index2];
array[index2] = temp;
}
}

Related Problems:

Remove Element

Problem:

Given an array and a value, remove all instances of that value in place and return the new length.

The order of elements can be changed. It doesn’t matter what you leave beyond the new length.

Subscribe to see which companies asked this question
Leetcode link
Lintcode link

Solution:

Use a ponter to point to the end index of the array. Go through the array, if the current element equals to the target value, swap the current value with the value at the end index, and then move the end pointer a unit towards the start direction. Then repeat.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
/*************************************************************************
> File Name: RemoveElement.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Thu 12 Nov 12:15:44 2015
************************************************************************/


public class RemoveElement{
public int removeElement(int[] nums, int val) {
int endIndex = nums.length - 1;

int index = 0;
while (index <= endIndex){
if (nums[index] == val){
int temp = nums[index];
nums[index] = nums[endIndex];
nums[endIndex] = temp;
endIndex--;
} else {
index++;
}
}

return endIndex + 1;
}
}

4Sum

Problem:

Given an array S of n integers, are there elements a, b, c, and d in S such that a + b + c + d = target? Find all unique quadruplets in the array which gives the sum of target.

Note:
Elements in a quadruplet (a,b,c,d) must be in non-descending order. (ie, a ≤ b ≤ c ≤ d)
The solution set must not contain duplicate quadruplets.
For example, given array S = {1 0 -1 0 -2 2}, and target = 0.

A solution set is:
(-1,  0, 0, 1)
(-2, -1, 1, 2)
(-2,  0, 0, 2)

Leetcode link
Lintcode link

My solution:

The solution is nearly the same as 3sum

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
/*************************************************************************
> File Name: FourSum.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Wed 11 Nov 23:48:43 2015
************************************************************************/


public class FourSum{
public ArrayList<ArrayList<Integer>> fourSum(int[] nums, int target) {
//write your code here
ArrayList<ArrayList<Integer>> result = new ArrayList<ArrayList<Integer>>();
if (nums.length < 4) return result;
Arrays.sort(nums);

// note i < nums.length - 3
for (int i = 0; i < nums.length - 3; i ++){
if (i != 0 && nums[i - 1] == nums[i])
continue;

// note j start from i + 1, but not i
for (int j = i + 1; j < nums.length - 2; j++){
if (j != i + 1 && nums[j - 1] == nums[j])
continue;

int start = j + 1;
int end = nums.length - 1;

while (start < end){
int sum = nums[i] + nums[j] + nums[start] + nums[end];
if (sum == target){
ArrayList<Integer> list = new ArrayList<Integer>();
list.add(nums[i]);
list.add(nums[j]);
list.add(nums[start]);
list.add(nums[end]);
result.add(list);
start++;
end--;

// note condition start < end
while (nums[start - 1] == nums[start] && start < end)
start++;

while (nums[end + 1] == nums[end] && start < end)
end--;
} else if (sum < target){
start++;
} else {
end--;
}
}
}
}

return result;
}
}

Related Problem:
Two Sum
3Sum
3Sum Closest

3Sum Closest

Problem:

Given an array S of n integers, find three integers in S such that the sum is closest to a given number, target. Return the sum of the three integers. You may assume that each input would have exactly one solution.

For example, given array S = {-1 2 1 -4}, and target = 1.

The sum that is closest to the target is 2. (-1 + 2 + 1 = 2).

Leetcode link
Lintcode link

Solution:

Sort the list first, which takes O(Nlog N). Then go through the sorted list, for each integer “nums[i]” in the list,
narrow the range from the integers at index i and the end of the list. It takes O(n^2)

Use a ‘min’ variable to record the minimum difference, and a ‘result’ variable to record the sum with the minimum difference.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
/*************************************************************************
> File Name: ThreeSumClosest.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Wed 11 Nov 23:20:11 2015
************************************************************************/


public class ThreeSumClosest{
public int threeSumClosest(int[] nums, int target) {
int min = Integer.MAX_VALUE;
int result = 0;

Arrays.sort(nums);

for (int i = 0; i < nums.length; i++){
int start = i + 1;
int end = nums.length - 1;

while (start < end){
int sum = nums[i] + nums[start] + nums[end];
int diff = Math.abs(sum - target);

if (diff < min){
min = diff;
result = sum;
}

if (sum == target)
return sum;
else if (sum < target)
start++;
else
end--;
}
}

return result;
}
}

3Sum

Problem:

Given an array S of n integers, are there elements a, b, c in S such that a + b + c = 0? Find all unique triplets in the array which gives the sum of zero.

Note:
Elements in a triplet (a,b,c) must be in non-descending order. (ie, a ≤ b ≤ c)
The solution set must not contain duplicate triplets.
For example, given array S = {-1 0 1 2 -1 -4},

A solution set is:
(-1, 0, 1)
(-1, -1, 2)

Leetcode link
Lintcode link

My Solution:

Sort the list first, which takes O(Nlog N). Then go through the sorted list, for each integer “nums[i]” in the list,
narrow the range from the integers at index i and the end of the list. It takes O(n^2)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
/*************************************************************************
> File Name: ThreeSum.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Wed 11 Nov 18:17:20 2015
************************************************************************/


public class ThreeSum{
public List<List<Integer>> threeSum(int[] nums){
List<List<Integer>> result = new ArrayList<List<Integer>>();

if (nums.length < 3)
return result;

Arrays.sort(nums);

for (int i = 0; i < nums.length; i++){
if (i != 0 && nums[i] == nums[i - 1])
continue;

int start = i + 1;
int end = nums.length - 1;

while (start < end){
if (nums[i] + nums[start] + nums[end] == 0){
List<Integer> list = new ArrayList<Integer>();
list.add(nums[i]);
list.add(nums[start]);
list.add(nums[end]);
result.add(list);

start++;
end--;

while (nums[start - 1] == nums[start] && start < end)
start++;

while(nums[end + 1] == nums[end] && start < end)
end--;
} else if (nums[i] + nums[start] + nums[end] < 0){
start++;
} else{
end--;
}
}
}
return result;
}
}

Related Problem:
Two Sum
4Sum
3Sum Closest

Two Sum

Problem:

Given an array of integers, find two numbers such that they add up to a specific target number.

The function twoSum should return indices of the two numbers such that they add up to the target, where index1 must be less than index2. Please note that your returned answers (both index1 and index2) are not zero-based.

You may assume that each input would have exactly one solution.

Input: numbers={2, 7, 11, 15}, target=9
Output: index1=1, index2=2

Leetcode
Lintcode

Solution

We can use a hash map to record the occurrence and index of integers.

When we go through the list, the key (target - current integer) exits in the hash map, then we can return current index and index corresponding to (target - current integer).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
/*************************************************************************
> File Name: TwoSum.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Wed 11 Nov 17:49:20 2015
************************************************************************/


public class TwoSum{
public int[] twoSum(int[] nums, int target) {
HashMap<Integer,Integer> map = new HashMap<Integer, Integer>();

for (int i = 0; i < nums.length ; i++){
if (map.containsKey(nums[i])) continue;

if (map.containsKey(target - nums[i])){
int[] result = {map.get(target - nums[i]) + 1, i + 1};
return result;
}

map.put(nums[i],i);
}

return null;
}
}

Related Problem:
3Sum
4Sum
3Sum Closest

Median of Two Sorted Arrays

Problem:

There are two sorted arrays nums1 and nums2 of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

Example
Given A=[1,2,3,4,5,6] and B=[2,3,4,5], the median is 3.5.

Given A=[1,2,3] and B=[4,5], the median is 3.

Challenge
The overall run time complexity should be O(log (m+n)).
Leetcode link
Lintcode link

My Solution:

This is a classic problem, which can be viewed as another form: “Given two sorted array, how to find the k-th largest number among both arrays?”.

The k in this problem is “total length / 2” for odd number. If the total length is even, we need to find “total length / 2” and “total length / 2 + 1” number, and then get the average value.

The most intuitive solution is to merge them and find the k-th number, which takes O(M + N);

There is a better solution. Assume both amount of elements in A and B are greater than k / 2. If we compare the (k / 2)-th value in A (A[k / 2 - 1]) and the (k / 2)-th value in B (B[k / 2 - 1]), we have three possilbe cases:

  • If A[k / 2 - 1] == B[k / 2 - 1], we can say the value is the k-th value we want.
  • If A[k / 2 - 1] < B[k / 2 - 1], it means integer from A[0] to A[k / 2 - 1] should be smaller than the k-th value, so we can drop these values.
  • If A[k / 2 - 1] > B[k / 2 - 1], it means integer from B[0] to B[k / 2 - 1] should be smaller than the k-th value, so we can drop these values.

To drop the values in A and B, we can use two pointers to point to the start index of A and B respectively.
We can build a recursion whose stop points are:

  • If the start index pointer of A or B exceed the length, return B[BStart + k - 1] or A[AStart + k - 1];
  • If k == 1, return the minimum of A[AStart] or B[BStart]
  • If A[AStart + k / 2 -1] == B[BStart + k / 2 -1], return the value.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
/*************************************************************************
> File Name: MedianOfTwoSortedArray.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Tue 10 Nov 22:01:06 2015
************************************************************************/


public class MedianOfTwoSortedArray{
public double findMedianSortedArrays(int[] A, int[] B) {
int len = A.length + B.length;
if (len % 2 != 0)
return findKth(A,0,B,0,len / 2 + 1);
else
return (findKth(A,0,B,0,len / 2) + findKth(A,0,B,0,len / 2 + 1)) / 2.0;
}

public static int findKth(int[] A, int AStart,
int[] B, int BStart, int k)
{

if (AStart >= A.length)
return B[BStart + k - 1] ;
if (BStart >= B.length)
return A[AStart + k - 1] ;
if (k == 1)
return Math.min(A[AStart],B[BStart]);

int va = AStart + k / 2 <= A.length ?
A[AStart + k / 2 -1] : Integer.MAX_VALUE;
int vb = BStart + k / 2 <= B.length ?
B[BStart + k / 2 -1] : Integer.MAX_VALUE;

if (va < vb)
return findKth(A, AStart + k / 2, B, BStart, k - k / 2);
else if (va > vb)
return findKth(A, AStart, B, BStart + k / 2, k - k / 2);
else
return va;
}
}

Search in Rotated Sorted Array II

Problem:

Follow up for “Search in Rotated Sorted Array”:
What if duplicates are allowed?

Would this affect the run-time complexity? How and why?

Write a function to determine if a given target is in the array.

Leetcode link
Lintcode link

My Solution:

Compared to the “Search in Rotated Sorted Array”, the elements can be duplicate, so we can not know the start point is on the left part or right part, even if we have “nums[left] <= nums[mid]”.

We can only judge when “nums[left] < nums[mid]”. So if “nums[left] == nums[mid]”, we can add left lie “left++” until “nums[left]” and “nums[right]” are different. Code implemented in ‘search’ method.

The time complexity is O(n).

Solution 1:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
/*************************************************************************
> File Name: SearchInRotatedArrayII.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Tue 10 Nov 18:16:29 2015
************************************************************************/


public class SearchInRotatedArrayII{
public boolean search(int[] nums, int target) {
if (nums == null || nums.length == 0)
return false;

int left = 0;
int right = nums.length - 1;

while (left != right){
int mid = left + (right - left) / 2;
if (nums[mid] == target)
return true;

if (nums[mid] > nums[left]){
if (target >= nums[left] && target < nums[mid])
right = mid - 1;
else
left = mid + 1;
} else if (nums[mid] < nums[left]) {
if (target > nums[mid] && target <= nums[right])
left = mid + 1;
else
right = mid - 1;
} else {
left++;
}
}

if (nums[left] == target) return true;

return false;
}
}

Solution 1 is better to return an index. If we just need to return boolean, a brute force method is ok, like Solution 2.

1
2
3
4
5
6
public boolean search2(int[] nums, int target) {
for (int n : nums){
if (n == target) return true;
}
return false;
}

Related Problem:
Search in Rotated Sorted Array

Search in Rotated Sorted Array

Problem:

Suppose a sorted array is rotated at some pivot unknown to you beforehand.

(i.e., 0 1 2 4 5 6 7 might become 4 5 6 7 0 1 2).

You are given a target value to search. If found in the array return its index, otherwise return -1.

You may assume no duplicate exists in the array.

Example
For [4, 5, 1, 2, 3] and target=1, return 2.

For [4, 5, 1, 2, 3] and target=0, return -1.

Challenge
O(logN) time

lintcode link
Leetcode link

My Solution:

Use binary search to find the match.
There are some notable tips:

  • Check the start point first (at the left or right side of mid), then assume the target on the consecutive segment.
  • While loop condition: left != right
  • After the while loop, we should add “if (nums[left] == target) return left;”
  • The if condition “nums[mid] >= nums[left]” should contain the equal sign

The time complexity is O(log n).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
 /*************************************************************************                                                                                                          
> File Name: SearchInRotatedArray.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Thu 29 Oct 15:26:17 2015
************************************************************************/


public class SearchInRotatedArray{
if (nums == null || nums.length == 0)
return -1;

int left = 0;
int right = nums.length - 1;

while (left != right){
int mid = left + (right - left) / 2;
if (nums[mid] == target)
return mid;

if (nums[mid] >= nums[left]){
if (target >= nums[left] && target < nums[mid])
right = mid - 1;
else
left = mid + 1;
} else {
if (target > nums[mid] && target <= nums[right])
left = mid + 1;
else
right = mid - 1;
}
}

if (nums[left] == target) return left;

return -1;
}
}

Related Problem:
Search in Rotated Sorted Array II

Remove Duplicates from Sorted Array II

Problem:

Follow up for “Remove Duplicates”:
What if duplicates are allowed at most twice?

For example,
Given sorted array nums = [1,1,1,2,2,3],

Your function should return length = 5, with the first five elements of nums being 1, 1, 2, 2 and 3. It doesn’t matter what you leave beyond the new length.

Leetcode link
lintcode link

Solution:

Compared to Remove Duplicates from Sorted Array, We onlt need to add an another variable to record the number of times that the same elements occur, named as ‘same’;
This is a sorted map, so we just need to one variable to solve the problem, if not sorted, we need a Hashmap to record the occurrence of each elements.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
/*************************************************************************
> File Name: RemoveDuplicate2.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Thu 29 Oct 15:26:17 2015
************************************************************************/


public class RemoveDuplicate2{
public int removeDuplicates(int[] nums) {
// write your code here
if (nums == null || nums.length == 0)
return 0;

int last = 0;
int same = 0;
for (int i = 1; i < nums.length; i++){
if (nums[last] == nums[i])
same++;
else
same = 0;

if (same < 2)
nums[++last] = nums[i];
}

return ++last;
}
}

Remove Duplicates from Sorted Array

Remove Duplicates from Sorted Array

Problem:

Given a sorted array, remove the duplicates in place such that each element appear only once and return the new length.

Do not allocate extra space for another array, you must do this in place with constant memory.

Example
Given input array A = [1,1,2],

Your function should return length = 2, and A is now [1,2].

Tags Expand

Related Problems Expand

Leetcode link
lintcode link

My Solution:

In this question, a variable named ‘last’ is used to record the index of last element in the result array.
Go through the array, if the current element nums[i] is not equal to the last element of result array ‘nums[last]’,
add the current element to end of result array.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
/*************************************************************************
> File Name: RemoveDuplicate2.java
> Author: Yao Zhang
> Mail: psyyz10@163.com
> Created Time: Thu 29 Oct 15:26:17 2015
************************************************************************/


public class RemoveDuplicate {
public int removeDuplicates(int[] nums) {
if (nums == null || nums.length == 0)
return 0;

int last = 0;

for (int i = 1; i < nums.length; i++){
if (nums[last] != nums[i]){
nums[++last] = nums[i];
}
}

return ++last;
}
}

Time complexity is O(n), space complexity is O(1)

Remove Duplicates from Sorted Array II

Deep Learning Overview

Nowadays, the performances of machine learning models heavily rely on the representation of data or feature selection steps rather than just the choice of machine learning algorithms. Thus much effort is applied on preprocessing pipelines such as feature selection. Even though some specific domain knowledge can be used to help design the representation of data, the motivation of Artificial Intelligence needs more powerful representations of features. Deep Learning also called unsupervised learning, is a relatively new research field of machine learning which can learn multi- ple levels of abstraction and representation of features directly from data. It aims at learning feature hierarchies with higher level features formed by the composition of lower ones. The multiple levels structures allow to build complex functions which take data as input and output the result directly without depending on features crafted by humans [Bengio09].

Deep learning achieved many successful results on some problems, such as im- age classification [Ciresan12] [Krizhevsky12], semantic parsing [Bordes12] and speech recognition [Dahl12]. Deep architecture may express the complex distributions more efficiently with better per- formance on challenging tasks [Bengio07][Bengio09]. The hypothesis that the composition of addi- tional functional levels can give more powerful modeling capacity has already been proposed for a long time [Hinton90][Rumelhart86]. However, the training process of deep architec- ture was proven to be very difficult, until some successful approaches of [Bengio007][Hinton06][Hinton006] for training stacked autoencoder and DBN occurred. One key idea behind them is to train the deep architecture layer by layer by unsupervised learning, which is also called unsupervised feature learning. Each layer will generate a more abstract representation of the observed layer by doing a local optimization. Unsupervised feature learning can learn useful features automatically and directly from the data set by unsupervised learning without given specific features defined by human. The unsupervised learned features are more natural with less information lost [Bengio15]. Some deep learning models also have a potential powerful capacity of solving time-series problems [La14], which is another reason that makes deep learning suitable for stock trend prediction. Therefore deep learning can provide a new potential and powerful approach to improve stock prediction.

Building Deep Representations

Experimental results show that it is much harder to train a deep architecture than training a shallow one [Bengio07][Erhan09]. The rapid recent growth and success of deep learning owe to a breakthrough initiated by Geoff Hinton in 2006 and quickly followed up by some papers [Hinton006][Bengio94][Poultney06]. Greedy layer wise unsupervised pre-training as a central idea was proposed. It means just one layer of the hierarchy is trained at one time by unsupervised feature learning to learn a new transformation of data with the previous layer output. Finally, the set of pre-trained layers is combined with a stan- dard supervised classifier such as Support Vector Machine, Logistic Regression or as initialization for a deep learning model such as a Stacked Denoising Auto-encoder or Deep Belief Network. Experiments show that the layer-wise stacking can attain a better feature representation in most time [Larochelle09]. Although it is not difficult to combine single layers pretrained by unsupervised learning into a supervised model, it is not very clear how the single layers should combine to form a better unsuper- vised model [Bengio15]. One approach is to stack pre-trained RBMs (section 4.2.1) into DBN (section 4.2.2).

Stacked Denoising Autoencoders


[Bengio09] Yoshua Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127, 2009. Also published as a book. Now Publishers, 2009.

[Ciresan12] Dan Ciresan, Ueli Meier, and Ju ̈rgen Schmidhuber. Multi-column deep neural networks for image classification. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pages 3642–3649. IEEE, 2012.

[Krizhevsky12] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.

[Bordes12] Antoine Bordes, Xavier Glorot, Jason Weston, and Yoshua Bengio. Joint learn- ing of words and meaning representations for open-text semantic parsing. In International Conference on Artificial Intelligence and Statistics, pages 127– 135, 2012.

[Dahl12] George E Dahl, Dong Yu, Li Deng, and Alex Acero. Context-dependent pre- trained deep neural networks for large-vocabulary speech recognition. Audio, Speech, and Language Processing, IEEE Transactions on, 20(1):30–42, 2012.

[Bengio07] Yoshua Bengio, Yann LeCun, et al. Scaling learning algorithms towards ai. Large-scale kernel machines, 34(5), 2007.

[Hinton90] Geoffrey E Hinton. Connectionist learning procedures. artificial intelligence, 40 1-3: 185 234, 1989. reprinted in j. carbonell, editor,”. Machine Learning: Paradigms and Methods”, MIT Press, 1990.

[Rumelhart86] David E Rumelhart and James L McClelland. The pdp research group: Par- allel distributed processing: Explorations in the microstructure of cognition. Foundations, 1, 1986.

[Bengio007] Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle, et al. Greedy layer-wise training of deep networks. Advances in neural information processing systems, 19:153, 2007.

[Hinton06] Geoffrey E Hinton and Ruslan R Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006.

[Hinton006] Geoffrey E Hinton, Simon Osindero, and Yee-Whye Teh. A fast learning algo- rithm for deep belief nets. Neural computation, 18(7):1527–1554, 2006.

[Erhan09] Dumitru Erhan, Pierre-Antoine Manzagol, Yoshua Bengio, Samy Bengio, and Pascal Vincent. The difficulty of training deep architectures and the effect of unsupervised pre-training. In International Conference on artificial intelligence and statistics, pages 153–160, 2009.

[Bengio15] Yoshua Bengio, Ian J. Goodfellow, and Aaron Courville. Deep learning. Book in preparation for MIT Press, 2015.

[La14] Martin La ̈ngkvist, Lars Karlsson, and Amy Loutfi. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recogni- tion Letters, 42:11–24, 2014.

[Bengio94] Yoshua Bengio, Patrice Simard, and Paolo Frasconi. Learning long-term depen- dencies with gradient descent is difficult. Neural Networks, IEEE Transactions on, 5(2):157–166, 1994.

[Poultney06] Christopher Poultney, Sumit Chopra, Yann L Cun, et al. Efficient learning of sparse representations with an energy-based model. In Advances in neural information processing systems, pages 1137–1144, 2006.

[Larochelle09] Hugo Larochelle, Yoshua Bengio, J ́eroˆme Louradour, and Pascal Lamblin. Ex- ploring strategies for training deep neural networks. The Journal of Machine Learning Research, 10:1–40, 2009.

[Bengio15] Yoshua Bengio, Ian J. Goodfellow, and Aaron Courville. Deep learning. Book in preparation for MIT Press, 2015.

Stacked Denoising Autoencoders

Autoencoders

An autoencoder [Bengio09] is a network whose graphical structure is shown in Figure 4.1, which has the same dimension for both input and output. It takes an unlabeled training examples in set where is a single input and encodes it to the hidden layer by linear combination with weight matrix and then through a non-linear activation function. It can be mathematically expressed as , where is the bias vector.



Figure 4.1: An Autoencoder


After that the hidden layer representation will be reconstructed to the output layer 􏰌 through a decoding function, in which 􏰌 has a same shape as . Hence the decoding function can be mathematically expressed as 􏰌, where can be called tried weights. In this project, tied weights were used. The aim of the model is to optimize the weight matrices, so that the reconstruction error between input and output can be minimized. It can be seen that the Autoencoder can be viewed as an unsupervised learning process of encoding-decoding: the encoder encodes the input through multi-layer encoder and then the decoder will decode it back with the lowest error [Hinton06].
To measure the reconstruction error, traditional squared error can be used. One of the most widely used way to measure that is the cross entropy if the input can be represented as bit vector or bit possibilities. The cross entropy error is shown in Equation 4.1:



Equation:4.1


The hidden layer code can capture the information of input examples along the main dimensions of variant coordinates via minimizing the reconstruction error. It is similar to the principle component analysis (PCA) which project data on the main component that captures the main information of the data. h can be viewed as a compression of input data with some lost, which hopefully not contain much information about the data. It is optimized to compress well the training data and have a small reconstruction error for the test data, but not for the data randomly chosen from input space.

Denoising Autoencoders

In order to prevent the Autoencoder from just learning the identity of the input and make the learnt representation more robust, it is better to reconstruct a corrupted version of the input. The Autoencoder with a corrupted version of input is called a Denoising Autoencoder. Its structure is shown in Figure 4.2. This method was proposed in [Vincent08], and it showed an advantage of corrupting the input by comparative experiments. Hence we will use denoising autoencoders instead of classic autoencoders in this thesis.



Figure 4.2: A graphical figure of Denoising Autoencoder. An input x is corrupted to . After that the autoencoder maps it to the hidden representation and attempts to reconstruct .


A Denoising Autoencoder can be seen as a stochastic version with adding a stochastic corruption process to Autoencoder. For the raw inputs , some of them will be randomly set to 0 as corrupted inputs . Next the corrupted input will be en- coded to the hidden code and then reconstructed to the ouput. But now 􏰌 is a deterministic function of rather than . As Autoencoder, the reconstruction is considered and calculated between 􏰌 and noted as . The parameters of the model are initialized randomly and then optimized by stochastic gradient descent algorithms. The difference is that the input of the encoding process is a corrupted version , hence it forces a much more clever mapping than just the identity, which can denoise and extract useful features in a noise condition.

The training algorithm of a denoising autoencoder is summarized in Algorithm 2.



Stacked Autoencoder

Unsupervised pre-training
A Stacked Autoencoder is a multi-layer neural network which consists of Autoencoders in each layer. Each layer’s input is from previous layer’s output. The greedy layer wise pre-training is an unsupervised approach that trains only one layer each time. Every layer is trained as a denoising autoencoder via minimising the cross entropy in reconstruction. Once the first layer has been trained, it can train the layer by using the previous layer’s hidden representation as input. An example is shown below. Figure 4.3 shows the first step of a stacked autoencoder. It trains an autoencoder on raw input to learn by minimizing the reconstruction error .



Figure 4.3: Step 1 in Stacked Autoencoders


Next step shown in Figure 4.4. The hidden representation would be as ”raw input” to train another autoencoder by minimizing the reconstruction error . Note that the error is calculated between previous latent feature representation and the output . Parameters and will be optimized by the gradient descent algorithm. The new hidden representation h2 will be the ’raw input’ of the next layer.



Figure 4.4: Step 2 in Stacked Autoencoders



The pre-training algorithm of stacked denoising autoencoder is summarized in algorithm 3.


Supervised fine-tuning

At last once all the layers has been pre-trained, the next step called fine-tuning is performed. A supervised predictor will be extended to the last layer, such as support vector machine or a logistic regression layer. In this project, we chose a logistic regression layer. After that the network will be trained. A sample graph is shown in Figure 4.5. It can be seen that for each layer of the network, only the encoding hidden representation are considered. The fine-tuning step will train the whole network by back-propagation like training an Artificial Neural Network. A stacked denoising autoencoder is just replace each layer’s autoencoder with denoising autoencoder whilst keeping other things the same.



Figure 4.5: A complete architecture of stacked autoencoder


The supervised fine-tuning algorithm of stacked denoising auto-encoder is summa- rized in Algorithm 4.





[Bengio09] Yoshua Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127, 2009. Also published as a book. Now Publishers, 2009.

[Hinton06] Geoffrey E Hinton and Ruslan R Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006

[Vincent08] Pascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Man- zagol. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th international conference on Machine learning, pages 1096–1103. ACM, 2008.