An Evaluation Method for Stemming Algorithms

Chris D. Paice
Department of Computing, Lancaster University
Bailrigg, Lancaster LA1 4YR, U.K.
Abstract

The effectiveness of stemming algorithms has usually been measured in terms of their effect on retrieval performance with test collections. This however does not provide any insights which might help in stemmer optimisation. This paper describes a method in which stemming performance is assessed against predefined concept groups in samples of words. This enables various indices of stemming performance and weight to be computed. Results are reported for three stemming algorithms. The validity and usefulness of the approach, and the problems of conceptual grouping, are discussed, and directions for further research are identified.
Introduction

Stemming is a widely-used method of word standardisation designed to allow the matching of morphologically related terms, such as "clusters" and "clustering". The idea is that, in a language like English, a typical word contains a stem which refers to some central idea or 'meaning', and that certain affixes have been added to modify the meaning and/or to fit the word for its syntactic role. The purpose of stemming is to strip away the affixes and thus reduce the word to its essence. In practice, some affixes may alter the meaning of a word so greatly that to remove them would be to discard vital information. In particular, deletion of prefixes is not generally felt to be helpful, except in certain domains such as medicine and chemistry. On the other hand, most suffixes in English are considered to be potentially removable. This paper is concerned with stemming in the restricted sense of suffix removal. A useful summary of various stemming and conflation algorithms is given in the paper by Lennon et al. [1]; further stemming algorithms have been described by Frakes [2] and by Paice [3].

Stemming Errors

There are two particular problems in using stemming for word standardisation. In the first place, pairs of etymologically related words sometimes differ sharply in meaning - for example, consider "author" and "authoritarian". In the second place, the transformations involved in adding and removing suffixes involve numerous irregularities and special cases. Stemming errors are of two kinds: understemming errors, in which words which refer to the same concept are not reduced to the same stem, and overstemming errors, in which words are converted to the same stem even though they refer to distinct concepts. In designing a stemming algorithm there is a trade-off between these two kinds of error. A light stemmer plays safe in order to avoid overstemming errors, but consequently leaves many understemming errors. A heavy stemmer boldly removes all sorts of endings, some of which are decidedly unsafe, and therefore commits many overstemming errors.

There have been several investigations into the effects of stemming on retrieval performance in test collections [1,2,4,5]. In most cases, stemming was found to improve retrieval performance, but not by very much, and there were no consistent differences of performance between different stemmers. Harman was unable to show any consistent benefit over not using stemming at all [6].
We might clearly expect a relationship between the weight, or strength, of stemming and its effect on retrieval performance; in particular, it might be supposed that heavy stemming is appropriate when high recall is needed. Lennon et al. used the degree of dictionary compression as a measure of the weight of each stemmer, but found that this did not correlate with the retrieval performance obtained [1].

Although it may seem obviously reasonable to evaluate stemmers in terms of their effects on retrieval performance with test collections, such an approach is in practice unhelpful, whether for precision-oriented or for recall-oriented types of search, because it gives no insight into the specific causes of stemming errors. Moreover, stemmers are not used only in IR systems - for example, they may be used in a natural language interface, or in a frame-instantiation program. IR-based evaluations are irrelevant to such applications.

This paper outlines an evaluation method which is based on detecting and counting the actual understemming and overstemming errors committed during stemming of samples of words derived from actual texts. This permits the computation, for each stemmer, of a 'stemming weight' index, as well as indices representing the under- and overstemming error rates and the general accuracy. The method involves manually dividing a sample of words into conceptual groups, and it relies upon an assumption that the words in a sample can be divided into 'concept groups', the words in each group referring to the same concept, and that these groups can be used for evaluation purposes. It may be objected that humans cannot be relied upon to make entirely objective grouping decisions, and that words do not in any case fall into clear-cut semantic groups, so that the very 'reality' against which stemmer performance is assessed is only approximate. In fact, of course, careful human judgment can produce a semantic grouping which is entirely reasonable for evaluation purposes, even if it is somewhat approximate.

In using a sample of words taken from a natural text source, we encounter the question of whether to base the evaluation on all the individual word tokens in the sample, or on only the distinct word types.
In the first case, we would determine the frequencies of all the word types and take these into account in computing the performance indices. This is perfectly straightforward, but it is found that the results then obtained tend to be dominated by the way the stemmer handles a quite small number of high-frequency word groups - for example, common verbs such as "be"/"being"/"been", "do"/"doing"/"done", etc. This is unfortunate, since in practice common and irregular forms are often handled by lexical lookup anyway. Stemmers are mainly required to deal with words of relatively rare and unpredictable occurrence. In view of this, our evaluation method uses word types rather than tokens, and ignores frequencies of occurrence. An incidental benefit is that there is no need to remove syntactic function words - nor to worry about what should be included in the stoplist - since it can be shown that inclusion or exclusion of these words has very little effect on the results.
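To illustrate working with word types rather than tokens, here is a minimal Python sketch (illustrative only, not part of the programs described later in the paper; the tokenisation rule is an assumption) that reduces a text sample to its distinct word types:

```python
import re

def word_types(text):
    """Return the sorted distinct word types in a text sample.

    Frequencies are deliberately discarded, since the evaluation
    method treats each word type once regardless of token count.
    """
    return sorted(set(re.findall(r"[a-z]+", text.lower())))
```

For example, `word_types("Be being been, be!")` yields `["be", "been", "being"]`: the repeated token "be" contributes only one type.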
Computation of Performance Indices

Suppose we have a sample of W different words which is partitioned into 'concept groups', each containing forms which are morphologically and semantically related to one another. A perfect stemmer should merge every member of a concept group with every other member, and should not merge any member of a group with any word which is not in the group. For each concept group, two totals may therefore be computed. Firstly, there is the 'desired merge total' DMTg, counting the pairs of words which a perfect stemmer should merge:

    DMTg = 0.5 ng (ng - 1)

where ng is the number of words in the group. Secondly, there is the 'desired non-merge total' DNTg, counting the pairs of words which should not be merged:

    DNTg = 0.5 ng (W - ng)

each equation containing a 0.5 factor to compensate for double counting of pairs during the summation. By summing these two totals over all groups in the word sample, we obtain respectively the global desired merge total GDMT and the global desired non-merge total GDNT.
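As a concrete illustration of these definitions, the following Python sketch (not the author's program; the function name is an assumption) computes GDMT and GDNT for a grouped word sample:

```python
def desired_totals(groups):
    """Global desired merge total (GDMT) and desired non-merge total (GDNT).

    groups: a word sample partitioned into concept groups, given as a
    list of lists of distinct words.
    """
    W = sum(len(g) for g in groups)      # total words in the sample
    gdmt = gdnt = 0.0
    for g in groups:
        ng = len(g)
        gdmt += 0.5 * ng * (ng - 1)      # DMTg: pairs a perfect stemmer merges
        gdnt += 0.5 * ng * (W - ng)      # DNTg: pairs it must keep apart
    return gdmt, gdnt
```

For example, two groups of sizes 3 and 2 (so W = 5) give GDMT = 3 + 1 = 4 and GDNT = 3 + 3 = 6.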
After applying a stemmer to the sample, suppose that a concept group of size ng contains s distinct stems, with u1, u2, ..., us being the numbers of instances of these stems. If s > 1, the group contains understemming errors, and the number of pairs which the stemmer failed to merge (the 'unachieved merge total' for the group) is given by

    UMTg = 0.5 * sum over i = 1..s of ui (ng - ui)

Summing this quantity over all groups, we obtain the global unachieved merge total GUMT. The understemming index UI is now given by the ratio GUMT/GDMT.

After stemming, we also expect to find cases where the same stem occurs in two or more concept groups; these are the overstemming errors which need to be counted. The procedure here is to gather all cases of a particular stem into a 'stem group'; any stem group whose members are derived from two or more different concept groups contains overstemming errors. Suppose that a stem group of size ns contains members derived from f different concept groups, with v1, v2, ..., vf being the numbers of representatives of these concept groups. The number of wrongly-merged pairs for this stem group (the 'wrongly-merged total') is given by

    WMTs = 0.5 * sum over i = 1..f of vi (ns - vi)

Summing over all stem groups, we obtain the global wrongly-merged total GWMT. The overstemming index OI is now given by the ratio GWMT/GDNT.

It is clear that for a heavy stemmer the UI value will be quite low, whereas OI will be rather high; for a light stemmer the situation will be reversed. The ratio of these two quantities may therefore be taken as a measure of the weight of stemming, and we obtain the stemming weight SW:

    SW = OI / UI    {1}

It will commonly occur that one stemmer is better than another in terms of UI, but worse in terms of OI. Does it then make sense to ask which of the two is better overall? We may observe that, if one stemmer is better than another in terms of both UI and OI, and the difference is large, then it probably is better overall, at least for the word sample under consideration. In general, though, we lack any satisfactory theory for assessing the relative significance of understemming and overstemming errors. In order to judge the general accuracy of stemmers, we therefore need some kind of baseline against which their performance can be assessed.
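The index computations described above can be sketched in Python as follows (a minimal illustration of the definitions, not the author's actual software; `stem` stands for any stemming function):

```python
from collections import Counter

def stemming_indices(groups, stem):
    """UI, OI and SW for a stemmer over a sample of concept groups.

    groups: concept groups as lists of distinct words; stem: any
    function mapping a word to its stem.
    """
    W = sum(len(g) for g in groups)
    gdmt = gdnt = gumt = gwmt = 0.0
    stem_totals = Counter()      # ns: total members of each stem group
    contributions = []           # (stem, vi) for each concept group
    for g in groups:
        ng = len(g)
        gdmt += 0.5 * ng * (ng - 1)
        gdnt += 0.5 * ng * (W - ng)
        counts = Counter(stem(w) for w in g)     # the ui values for this group
        gumt += sum(0.5 * u * (ng - u) for u in counts.values())  # UMTg
        for s, v in counts.items():
            stem_totals[s] += v
            contributions.append((s, v))
    # WMTs = 0.5 * sum of vi * (ns - vi), accumulated over all stem groups
    for s, v in contributions:
        gwmt += 0.5 * v * (stem_totals[s] - v)
    ui = gumt / gdmt
    oi = gwmt / gdnt
    sw = oi / ui if ui else float("inf")         # SW = OI / UI, formula {1}
    return ui, oi, sw
```

An identity "stemmer" misses every desired merge (UI = 1, OI = 0), while an over-aggressive one merges distinct concepts and drives OI up at the expense of UI.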
For this baseline we refer to the process of length truncation - that is, reducing every word to just its first q letters (words of length q or shorter being left unchanged). Length truncation is the crudest method of stemming, and we would obviously expect any rule-based or table-based stemmer to do better. Note however that length truncation refers to not just one but a series of stemmers, each with a different value of q; for IR purposes, truncation lengths of 5, 6 and 7 seem to be the most useful.

The idea here is that if we determine (UI, OI) values for a series of truncation lengths, we can define a truncation line against which any stemmer can be assessed. Any stemmer will give a (UI, OI) point, and a performance measure, which we may call the error rate relative to truncation, or ERRT, can be obtained by extending a line from the origin O through the (UI, OI) point P until it intersects the truncation line at T, as illustrated in Figure 1. ERRT is then simply defined as

    ERRT = length(OP) / length(OT)

In general, the further away the point is from the truncation line, the better the stemmer can be said to be.
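The geometric construction of ERRT can be sketched as follows (an illustrative implementation, not the paper's program; it treats the truncation line as a polyline through the truncation points and intersects it with the ray O-P):

```python
def errt(point, trunc_points):
    """Error rate relative to truncation: length(OP) / length(OT).

    point: the stemmer's (UI, OI) pair; trunc_points: (UI, OI) pairs for a
    series of truncation lengths, ordered by increasing UI, defining the
    truncation line.
    """
    px, py = point
    for (ax, ay), (bx, by) in zip(trunc_points, trunc_points[1:]):
        # Solve t*(px, py) = (ax, ay) + s*((bx, by) - (ax, ay))
        det = py * (bx - ax) - px * (by - ay)
        if abs(det) < 1e-12:
            continue                                   # ray parallel to segment
        s = (px * ay - py * ax) / det                  # position along segment
        t = (ay * (bx - ax) - ax * (by - ay)) / det    # scale factor: T = t * P
        if 0.0 <= s <= 1.0 and t > 0.0:
            return 1.0 / t                             # |OP| / |OT| = 1 / t
    raise ValueError("ray from origin does not meet the truncation line")
```

A point lying exactly on the truncation line gives ERRT = 1; points closer to the origin give smaller (better) values.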
Figure 1: Computation of ERRT from a stemmer's (UI, OI) point P and the truncation line.
Experimental

A suite of computer programs was written to permit the processing described above. The processing falls broadly into four parts:

- conversion of a source text sample into a grouped file;
- application of one or more stemming algorithms to the words in the grouped file;
- computation of the UI, OI and SW indices for each stemmer, by comparing its output with the groups in the grouped file;
- computation of the ERRT value for each stemmer, by comparing its (UI, OI) point with the truncation line.

The only stage where human effort is involved is in the construction of the grouped file. The grouping program presents to the user an alphabetic display of all the distinct words from the source; adjacent groups are separated by inserting 'barriers' into the file. The program makes some of the 'obvious' grouping decisions on the user's behalf, but for all uncertain cases refers to the user for the decision. The grouped file thus produced needs to be checked using a standard editor, and perhaps a second and even a third scan of the file may be performed before the grouping is considered to be correct. For one thing, some of the 'obvious' decisions taken by the program may in fact be wrong. For another, alphabetic ordering may sometimes split up certain conceptual groups (consider "read", "readily", "reading", "readjust", "reads"). Thirdly, the user may require time to reconsider some of the more difficult groupings. The grouping process is thus a rather laborious one; once it is finished, however, the grouped file represents a permanent resource for future use.

A question arises over what to do about irregular verbs. It seems natural and proper that "fly" and "flew" should be placed in the same group, but what about "is" and "were", and "go" and "went"? Stemming relies on morphological regularities and similarities, so it seems wrong to penalise it for not merging totally dissimilar forms. In the event, a rule-of-thumb was used that words would be grouped together only if at least their first two letters were the same. This was partly a matter of convenience, since it is awkward to bring together words which are far apart in the alphabetic list. The rule obviously leads to anomalies - e.g., "bring" and "brought" are grouped, but "buy" and "bought" are not - but it has only a tiny effect on the values of the indices, because the indices are based on word types rather than tokens.

During the grouping process, the user is presented with individual, isolated words; no context is given, and so there is no chance to allow for the different meanings of ambiguous words. In making a grouping decision, the user is in effect deciding whether two words refer to the same underlying concept, given a knowledge of the general domain of the source material. This still leaves the question whether two words which typically refer to related but not quite identical concepts should be counted as equivalent. It may be that taking a 'strict' view of semantic equivalence will give materially different results than taking a 'loose' view. To investigate this point, groups were actually defined at two levels of 'tightness', using two kinds of inter-group barrier.
First, there is a level of tight subgroups, each containing words which refer to more-or-less identical concepts. Secondly, the tight subgroups are gathered into loose groups, containing words which may be only weakly related to one another; thus, any loose group may be subdivided into two or more tight subgroups. This approach means that for each stemmer which is evaluated against a given word sample, two separate sets of performance indices are generated. Some examples of two-level grouping are shown in Figure 2. The words "abstract" and "addition" have been placed in individual subgroups because they are both ambiguous; during the grouping, one sense of "abstract" was assigned to the subgroup relating to abstracting, because the domain of the source literature was known to be library science.

( abstract )  ( abstraction, abstractly )  ( abstracts, abstracting, abstracted, abstracters )
----------------------------------------------------------------------
( add, adds, adding, added, additive )  ( addition )  ( additional, additionally )
----------------------------------------------------------------------
( alter, alters, altered, alterations )  ( alternate, alternately, alternating, alternations )  ( alternative, alternatives, alternatively )
----------------------------------------------------------------------
( appropriate, appropriately )  ( appropriations )
----------------------------------------------------------------------
( author, author's, authors, authorship )  ( authoritative )  ( authority, authorities )  ( authoritarian )  ( authorized, authorization )
----------------------------------------------------------------------
( cost, costing, costed, costs )  ( costly )
----------------------------------------------------------------------
( devise, devising, devised )  ( device, devices )
----------------------------------------------------------------------
( element, elements, elemental )  ( elementary )
----------------------------------------------------------------------
( explicate, explicates, explicated, explication, explications )  ( explicit, explicitly )
----------------------------------------------------------------------
( frame, frames, framing, framed )  ( framework, frameworks )

Figure 2: Examples of two-level grouping. Words enclosed within parentheses are grouped tightly together. A horizontal line is a major barrier between adjacent loose groups. (All examples are actual groups taken from the CISI source.)

Performance evaluations were carried out for three stemmers whose details (including rule tables) are fully described in the IR literature: the Lovins stemmer [7], the Porter stemmer [8] and the Paice/Husk stemmer [3]. To provide a baseline for computing values of ERRT, UI and OI values were also obtained for simple truncation, using truncation lengths of 4, 5, 6, 7 and 8.
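A two-level grouping like that of Figure 2 can be represented directly in code. The sketch below (illustrative only; the variable names are not from the paper's software) stores each loose group as a list of tight subgroups, so that both levels of grouping can be derived from one structure:

```python
# Each loose group is a list of tight subgroups (lists of words).
# The entries follow Figure 2; only two loose groups are shown.
two_level = [
    [["abstract"], ["abstraction", "abstractly"],
     ["abstracts", "abstracting", "abstracted", "abstracters"]],
    [["frame", "frames", "framing", "framed"],
     ["framework", "frameworks"]],
]

# Tight grouping: every subgroup is its own concept group.
tight_groups = [sub for loose in two_level for sub in loose]

# Loose grouping: each loose group is flattened into one concept group.
loose_groups = [[w for sub in loose for w in sub] for loose in two_level]
```

Evaluating a stemmer once against `tight_groups` and once against `loose_groups` then yields the two separate sets of indices described above.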
A sample of words was obtained by processing all of the titles and abstracts in the CISI test collection, which is concerned with Library and Information Science. This source contained a total of 184,659 words, reduced to 9,757 after deletion of duplicates. Runs were also carried out using two smaller word samples: 1,527 distinct words (derived from 8,947 source words) from a textbook excerpt concerned with computer storage devices, and 3,559 distinct words (from 32,098 source words) from the texts of 14 papers on agriculture.
In order to investigate the influence of sample size on the performance indices, four subsamples were prepared from the CISI text source, by taking every nth line from the complete collection, with n values of 2, 4, 8 and 16, and then preparing grouped files as usual. The resulting word samples contained 7,304, 5,395, 3,804 and 2,654 word types respectively, compared with 9,757 for the full CISI vocabulary.
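The subsampling scheme just described can be sketched as follows (an illustrative fragment, not the paper's program; the tokenisation is a simplifying assumption):

```python
import re

def subsample_types(lines, n):
    """Word types of the subsample formed by keeping every nth line.

    lines: the source text as a list of lines. Word types, not tokens,
    are returned, matching the evaluation method's use of types.
    """
    kept = lines[::n]                        # every nth line of the source
    text = " ".join(kept).lower()
    return set(re.findall(r"[a-z]+", text))
```

Smaller n keeps more of the source, so the vocabulary shrinks as n grows from 2 to 16.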
Results

Table 1 shows the values of the indices obtained for the CISI sample. Although the corresponding values for the tight and loose levels of grouping were markedly different, the patterns of values for the different stemmers were much the same in all cases, so that similar conclusions could be drawn from the results at either level. Understemming values are naturally greater, and overstemming values less, for loose grouping than for tight grouping, since the loose groups are held together by weaker semantic and morphological similarities; this is understandably reflected in smaller SW values for loose grouping.
               ---------- tight grouping ----------    ---------- loose grouping ----------
               UI      OI         SW        ERRT       UI      OI         SW        ERRT
trunc(4)       0.062   0.000814   0.013127  --         0.099   0.000706   0.007155  --
trunc(5)       0.176   0.000262   0.001487  --         0.258   0.000183   0.000710  --
trunc(6)       0.337   0.000073   0.000218  --         0.442   0.000022   0.000050  --
trunc(7)       0.527   0.000028   0.000054  --         0.633   0.000002   0.000004  --
trunc(8)       0.700   0.000012   0.000017  --         0.780   0.000000   0.000000  --
Lovins         0.326   0.000063   0.000193  0.92       0.459   0.000020   0.000044  1.00
Paice/Husk     0.121   0.000118   0.000978  0.55       0.257   0.000051   0.000197  0.67
Porter         0.374   0.000028   0.000074  0.76       0.542   0.000004   0.000007  0.88

Table 1: Stemming performance indices for the CISI word sample.
If now we compare the pattern of values within the tight and loose sections of Table 1, we find no marked differences, and this suggests that the properties and validity of the indices are not strongly affected by the level of grouping - provided presumably that a consistent grouping strategy is used. Comparing the three stemmers with one another, the relative values of the four indices may be summarised as follows:

    UI(Porter) > UI(Lovins) > UI(Paice/Husk)
    OI(Paice/Husk) > OI(Lovins) > OI(Porter)
    SW(Paice/Husk) > SW(Lovins) > SW(Porter)
    ERRT(Lovins) > ERRT(Porter) > ERRT(Paice/Husk)
Although the magnitudes of the differences varied a good deal, and in a couple of cases were only marginal, the above inequalities actually held for all of the word samples tested. In terms of the UI index, Lovins was noticeably closer to Porter than to Paice/Husk.
If we take
ERRT
as a general
indicator
of performance
accuracy,
we would
have
to conclude
that Paice/Husk is a better stemmer than Porter, which is in turn better than Lovins. However, the differences in stemming weight between Paice/Husk and Porter are so great that it is probably meaningless to compare their accuracy: Paice/Husk is a heavy stemmer and Porter a light stemmer, and presumably each is suited to a different task.

It is helpful also to look at Figure 3, which plots OI against UI for the CISI sample at the tight grouping level. The generally inverse relationship between the two indices is as expected, and the truncation line is convex towards the origin. The figure also highlights the great difference in weight between Paice/Husk and Porter, and casts light on the performance of Lovins: a line joining the points for Paice/Husk and Porter would be concave towards the origin, with the Lovins point lying beyond it, suggesting that Lovins is genuinely less accurate than either of the other stemmers. This relationship holds for both tight and loose levels of grouping for all the word samples tested.
Figure 3: UI x OI plot for the CISI sample (tight grouping).
Lennon
et al. represented
the weights
of their
stemmers
by the dictionary
compression
each
could achieve. Their results for Lovins and Porter are compared with ours in Table 2. Our results from the CISI sample are closest to theirs from the Brown linguistic corpus; oddly, this appeared to be the least similar
source
to ours.
However, our ratio for the compression by Lovins compared to Porter was 1.14, whilst theirs were all in the range 1.13 to 1.18. Our values clearly confirm that Porter is a lighter stemmer than Lovins, and also that Paice/Husk is much heavier.
                    ----- this work -----    ------- Lennon et al. 1981 -------
                    n        compression     Brown    NPL     Inspec   Cranfield
sample words        9,757    100.0           --       --      --       --
tight groups        5,101    47.7            --       --      --       --
loose groups        4,350    55.4            --       --      --       --
Lovins stems        5,409    44.6            45.8     39.2    39.5     30.9
Paice/Husk stems    4,755    51.3            --       --      --       --
Porter stems        5,964    38.9            38.8     34.6    33.8     26.2

Table 2: Dictionary compression achieved by three stemmers. n represents numbers of distinct words or stems; the other values are percentage compression.

We now turn to the effect of sample size. We find that UI, representing the understemming rate, is fairly insensitive to sample size. OI, however, falls off sharply as sample size increases. OI is defined by the ratio GWMT/GDNT: the global desired non-merge total GDNT can be expected to depend on the square of the sample size, whereas the global wrongly-merged total GWMT is likely to increase less than proportionally to W. This is plausible, since the fresh words introduced as the source sample grows will tend increasingly to be non-domain-related singleton words. The value of SW, defined by formula {1}, consequently also falls off sharply, though it shows signs of levelling off at about 5,000 to 10,000 words. ERRT shows a less consistent tendency: whereas the values for Lovins decrease throughout as sample size increases, those for Paice/Husk decrease less fast, and those for Porter show little or no consistent tendency.
Findings

The general results of our experiments may be summarised as follows:

- The specific values of the performance indices vary markedly depending on the source text.
- For a particular word source, the values of UI are fairly insensitive to sample size, whereas the values of OI and SW fall off sharply as sample size increases. ERRT values show modest fluctuations.
- In terms of the stemming weight SW, Porter is a light stemmer and Paice/Husk a heavy stemmer. The difference in the SW value between Porter and Paice/Husk is so great that it is probably meaningless to compare their performance.
- The Lovins stemmer seems to be generally less accurate than either of the other two stemmers.
- The choice of grouping level does not appear to be a critical matter, provided that a consistent strategy is used in each case.

The lightness of the Porter stemmer is in agreement with earlier findings that Porter is significantly lighter than Lovins and several other algorithms [1].
Final Comments

The author is well aware of various doubts and problems with the methods described in this paper. Further work is clearly needed to explore the validity of the approach and to make the programs more useful. One area of difficulty concerns the subjective and fuzzy nature of the grouping operations, and it would be valuable to have some objective evidence to assist in this activity. Rather than basing the grouping on a display of isolated words, grammatically tagged words could be presented instead. Part-of-speech tags might be of some use, but semantic tags would appear to be of greater value [9]. Use of tagged words would of course mean that a particular word type might be assigned to different groups on different occasions.

Use of these stemmer evaluation tools (whether in the existing or an enhanced form) is not limited to comparing the performance of existing off-the-shelf stemmers: they can also be used for optimizing the rule-tables of the stemmer in question. For example, we might try to make Porter's algorithm stem a little more heavily by adding additional rules or by relaxing some of the contextual constraints. With Paice/Husk, the emphasis would be on reducing the number of overstemming errors by retracting or modifying some of the more troublesome rules. The types of changes considered would of course depend on the nature of the individual stemmer. In either case, it would be desirable to discover which rules are a serious cause of overstemming errors. This could be done by modifying the stemmer to keep a note of each removed ending (or the rule used to remove it) so that this information is available when the under- and overstemming errors are being counted. It should then be possible to compute the mean error significance for each different ending. It would also be interesting to apply this evaluation method to dictionary-based conflation, in order to investigate the extent of the performance gap between a well-optimised stemmer and a full dictionary-based conflation operation.

Acknowledgements. I should like to thank Gareth Husk, Chris Danson and Helen Simpson for their invaluable work in developing various parts of the software.
References

1. Lennon, M., Pierce, D.S., Tarry, B.D. and Willett, P. An evaluation of some conflation algorithms for information retrieval. Journal of Information Science 1981; 3, 177-183.
2. Frakes, W.B. Term Conflation for Information Retrieval. Ph.D. thesis, Syracuse University, NY, 1982.
3. Paice, C.D. Another stemmer. SIGIR Forum 1990; 24, 56-61.
4. Hafer, M.A. and Weiss, S.F. Word segmentation by letter successor varieties. Information Storage and Retrieval 1974; 10, 371-385.
5. Landauer, C. and Mah, C. Message extraction through estimation of relevance. In: Oddy, R.N. et al. (Eds.), Information Retrieval Research. London: Butterworths, 1981, pp. 117-138.
6. Harman, D. How effective is suffixing? Journal of the American Society for Information Science 1991; 42, 7-15.
7. Lovins, J.B. Development of a stemming algorithm. Mechanical Translation and Computational Linguistics 1968; 11, 22-31.
8. Porter, M.F. An algorithm for suffix stripping. Program 1980; 14, 130-137.
9. Wilson, A. and Rayson, P. The automatic content analysis of spoken discourse: a report on work in progress. In: Souter, C. and Atwell, E. (Eds.), Corpus-based Computational Linguistics. Rodopi, Amsterdam & Atlanta GA, 1993.