Data Clustering

Example of how data clustering promotes clarity, from a PhD thesis by Matthias Scholtz

In many applications, data arrives in a continuous stream: telephone records, multimedia data and financial transactions are typical examples. Stream clustering algorithms aim to build an accurate clustering from a sequence of points while using memory and time efficiently. Common algorithms, variants and heuristics include:

CURE - handles non-spherical clusters and outliers well by choosing several well-scattered representative points for each cluster and shrinking them toward the cluster's centroid
BIRCH - incrementally clusters incoming points by constructing a hierarchical data structure that keeps the number of input/output operations low
STREAM - achieves a constant-factor approximation of the k-median problem in a single pass while using only a small amount of memory
COBWEB - incrementally clusters using a classification tree as its hierarchical clustering model
C2ICM - selects objects as cluster seeds and assigns non-seed objects to the seed with the highest coverage to construct a flat partitioning cluster structure
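
To make the one-pass, bounded-memory idea concrete, the following is a minimal sketch of sequential (online) k-means. It is a simple stand-in heuristic and not an implementation of CURE, BIRCH, STREAM, COBWEB or C2ICM; the function name `sequential_kmeans` and the choice to seed the centers with the first k points are assumptions made for illustration.

```python
import math

def sequential_kmeans(stream, k):
    """One-pass clustering sketch: stores only k centers and k counts."""
    centers = []  # current center coordinates, one list per cluster
    counts = []   # number of points assigned to each center so far
    for point in stream:
        point = list(point)
        if len(centers) < k:
            # Assumption: seed the centers with the first k points seen.
            centers.append(point)
            counts.append(1)
            continue
        # Assign the point to its nearest center (Euclidean distance).
        nearest = min(range(k), key=lambda i: math.dist(centers[i], point))
        counts[nearest] += 1
        # Move that center toward the point so it stays the running mean.
        step = 1.0 / counts[nearest]
        centers[nearest] = [c + step * (p - c)
                            for c, p in zip(centers[nearest], point)]
    return centers, counts

centers, counts = sequential_kmeans([(0, 0), (10, 10), (0, 2), (10, 12)], k=2)
print(centers, counts)
```

Each point is examined once and then discarded, so memory use depends on k and the dimension of the points, not on the length of the stream.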

To clarify the notation: the input to stream clustering is a sequence of points in a metric space together with an integer k. The output is a set of k centers chosen so that the sum of the distances from each data point to its nearest center is minimized. The same k gives popular cluster-analysis techniques such as k-means and k-medoids clustering their names.
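
The objective described above can be written as a short function. This is a sketch under the assumption that points are numeric tuples and the metric is Euclidean distance; the name `k_median_cost` is hypothetical.

```python
import math

def k_median_cost(points, centers):
    """Sum, over all points, of the distance to the nearest center."""
    return sum(min(math.dist(p, c) for c in centers) for p in points)

points = [(0, 0), (0, 2), (10, 10), (10, 12)]
print(k_median_cost(points, [(0, 1), (10, 11)]))  # two centers
print(k_median_cost(points, [(5, 6)]))            # a single center costs more
```

A clustering algorithm for this problem searches for the k centers that make this value as small as possible. The k-means objective has the same shape but sums squared distances.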

Clustering matters because it helps uncover models and patterns in large volumes of seemingly patternless data. With the rise of data mining, research in the field has grown rapidly.

Icon credit EEPROM Eagle

BrainMass Solutions Available for Instant Download

Unix and Sleuth

This project needs to be done on a UNIX machine using the Sleuth forensic tools. If you are using your own machine, you need to install the Sleuth Kit forensic tools (http://www.sleuthkit.org) on your machine. This week, you need to use the Sleuth tools to carry out the following tasks on the FAT undelete image from http://d

Database Constraints

Business requirements are enforced by implementing database constraints on tables and columns. The database constraints available include the following: PRIMARY KEY, FOREIGN KEY (or REFERENTIAL INTEGRITY), NOT NULL, UNIQUE, and CHECK. Give a business requirement and the constraint that could be implemented to enforce it. Ex

Creating an SQL Server Database

Supply SQL Server data types when creating tables. For the employees table, create an employee ID field that generates a unique number for each employee and designates the field as a primary key in the job title table.

Need for Apache Hadoop and Its Uses

Why is Apache Hadoop needed, and what are its uses?

Computer Networks

Explain the benefits of network segmentation. Describe the different transport mechanisms included with TCP/IP. Explain each mechanism's approach to connection establishment and termination. Enumerate the applications that use TCP, the ones that use UDP, and the reasons why they use one or the other.

Text Editor: Create a File on a Newly Formatted Floppy Disk

1. Using a text editor, create a file that is between 5,000 and 6,000 bytes long on a newly formatted floppy disk. Calculate the file directory and FAT entries for the type of disk used, and check your calculations using absolute sector program sector.asm. 2. Write a program that will perform the DOS DIR command for a disk

File Systems used by Windows

Please compare and contrast the various file systems used by the different versions of Windows.

Web Design Standards

Search the Internet for two Web sites relating to Web design standards. Complete the following in your discussion cluster: Create a list of 10 Web design standards with your cluster. To create this list, discuss the standards on each site and determine the 10 your cluster feels to be the most appropriate for effective Web d

Working with tables in MS SQL server 2005

Need to know how to create a table with specified columns in SQL Server 2005. How can we add foreign key constraints? How can we add check constraints on particular columns? How can we create a new view on one or more tables?

SQL statements and databases

This must be done in SQL Server 2005. In the first exercise, the Class field in the Part table should be a string of size 5, and not an int. 1- Write a statement that creates a table named Part, with an Id field as an int identity primary key (PK), a SupplierId int field, a Description string field of size 25, a Count int f

Database Usage Memorandum

Please help me so I can complete the following: I have to prepare a two to three page memorandum (350 words per page) analyzing the use of databases in my organization. Include which database software is used (Microsoft, Informix, Oracle, etc.). Conclude by proposing improvements. For large organizations, restrict the scop

Windows 2000 Benefits Over Windows 2003

Discuss its features, pros, cons, compatibility, price, etc. Why choose Windows 2000 over 2003?

Server-Side Scripting Languages

Need assistance with the attached problem. There are several server-based scripting languages available, offering a wide array of features and complexity to web engineers. Briefly review the emergence of such languages and recommend any two as your development tool. Does Perl offer any unique feature compared to JavaScript and P

Networking

Please help in answering the 2 questions below. Thank you. 1. Network Media XYZ Corp. is planning a new network. Engineers in the design shop must be connected to the accountants and salespeople in the front office, but all routes between the two areas must traverse the shop floor, where arc welders and metal-stamping equipm

Fault Tolerance and Backups

What is the difference between fault tolerance and disaster recovery? How does a network administrator decide which backup method to implement?

OLAP versus RDBMS

Explain what one can do with an OLAP application that cannot be done with the same data in a spreadsheet or a relational database. Give two examples.

eBay database does not have referential integrity.

Though there is great risk in implementing a database in this fashion, eBay gained an extreme performance boost because the database didn't have to work as hard to ensure that the data "conformed." So my question is, if the database does not have referential integrity to keep the data clean, how does eBay ensure that the data

Data Mining

Differentiate between the following terms: A. Independent data mart and dependent data mart B. Fact table and dimension table C. OLTP and OLAP Chapter 7 1. Differentiate between the following terms: A. Validation data and test set data B. Positive correlation and negative correlation C. Control group and experi

If someone were to have a neural network that could scan information on all aspects of your life, where would that neural network be able to find information about you? (DMV, doctor's office)

If someone were to have a neural network that could scan information on all aspects of your life, where would that neural network be able to find information about you? (DMV, doctor's office) What kind of patterns might the neural network show from sources like my MD's office or the DMV?

Auditing, Dishonest Employees and Roles

1. How would auditing help you find a dishonest employee? 2. What business applications might find roles useful?

Oracle9i

1. You want to make a report of table attributes. This report consists of a series of queries on data dictionary views in which you specify the table name, and the queries return details about the table. Your goal is to have information on the report that is similar (in content, not format) to the information you see when you

Server Protection

Your team has been hired by a large restaurant called Habibi's. Habibi's has now grown into a national chain with hundreds of locations. Each location has one Windows Server 2003 and many Windows XP desktop computers. Your job is to set up the Windows Server 2003 so as to standardize operations so that software can be automa

Effectiveness of International Law on Computer Crimes

There is currently no international law governing computer crime. Answer the following questions: - Should there be an international law governing computer crime? - What should be the penalty for which crimes? - Who should enforce these laws?

Information Security (6 multiple choice) questions

Is there anyone who can help me better understand the 6 multiple choice questions that are attached? Please help if you can. I feel that 5 credits for these questions is a very reasonable compensation for the review. There are 6 multiple choice questions which I have answered. I am requesting someone with knowledge i

Exporting and Importing Data

Discuss different methods of exporting and importing, with an emphasis on efficiency and avoiding data corruption or misplacement.

SQL Server 2000 Databases Management

1. What is the difference between complete and differential backups? 2. Explain the meaning of each of the transaction levels supported by SQL Server. 3. Explain the difference among the simple, full, and bulk-logged recovery models. 4. What is the difference between clustered and nonclustered indexes?

Windows Server 2003: SCSI Adapter and Disk Storage Unrecognized

Let's say you start a Windows 2003 Server installation and the SCSI adapter and the disk storage attached to the adapter aren't recognized. What could be the cause and what steps can you take to solve the problem?

Use the Internet or computer magazines to investigate one of the following DBMSs

Use the Internet or computer magazines to investigate one of the following DBMSs: DB2, SQL Server, MySQL, Oracle, or Sybase. Then prepare a report that explains how the DBMS handles two of the following distributed database functions: deadlock, fragmentation, replication, the data dictionary or log, and distributed queries.

SQL Sample Database Queries

I am seeking help with solving several SQL statements. I am specifically looking for the code that goes with these problems. 6) A wide world importers company tracks its order information in a database that includes two tables: Order and LineItem. See table structures below: CREATE TABLE dbo.Order ( OrderID int NOT NULL,

Computer Human Interaction

Go to http://www.open-video.org and find the video clip about digital jewelry by typing chi in the search field (it should be on page 4). Write up a brief (3-5 paragraph) summary of what the video clip is demonstrating or what problem it is trying to solve. Be sure to identify the target audience and discuss whether the s