The relevancy scores may be assigned based on the proportion of comments in which a subjective attribute appears (such as a score of 0.5 for “cute” and a score of 0.2 for “awesome,” etc.).
The entity store may be hosted by storage devices, such as main memory, magnetic or optical storage-based disks, tapes or hard drives, NAS, SAN, etc.
In one aspect, subjective attributes for a first media item get identified based on a reaction to the first media item, and relevancy scores for the personal qualities with about the first media item get determined.
The network may be public (such as the Internet), a private network (such as a local area network (LAN) or vast area network (WAN)), or a combination thereof.
For example, when subjective attribute identifier has identified the subjective attributes “cute”, “funny”, and “awesome” based on comments to a media clip posted on a social networking website, relevancy scorer may determine relevancy scores for each of these three subjective attributes based on:
The frequency with which these subjective attributes appear in comments
The particular users that provided the subjective attributes
So forth
The results get provided to the feature extractor.
The personal attribute identifier may identify subjective attributes via text processing of users’ comments to an entity posted by a user on a social networking website.
The output has relevancy scores for each subjective attribute in the vocabulary for the particular entity.
The method gets performed, while in some other implementations may get performed by another machine.
The training set gets initialized to the empty set. An entity E gets selected and removed from the set S of entities.
The relevancy scores may be assigned based on the proportion of comments that a subjective attribute appears in (such as a score of 0.5 for “cute” and a score of 0.2 for “awesome,” etc.).
A browser (not shown) may run on the client machines (such as on the OS of the client machines).
The subjective attribute vocabulary may be organized hierarchically, possibly based on “meta-attributes” associated with the personal attributes (such as the personal attribute “funny” may have a meta-attribute “positive,” while the subjective point “disgusting” may have a meta-attribute “negative,” etc.).
The feature extractor obtains a feature vector for an entity using techniques such as:
Principal components analysis
Semidefinite embeddings
Isomaps
Partial least squares
So forth
The entities stored in the entity store may include user-generated content that gets uploaded by client machines and may include content provided by service providers such as:
News organizations
Publishers
Libraries
So on
A classifier (such as a support vector machine [SVM], AdaBoost, a neural network, a decision tree, etc.) gets trained by providing a set of training examples, where the input for a training example comprises a feature vector obtained from a particular entity (such as a feature vector for a video clip.
Abstract:
The subjective attributes and relevancy scores are associated with the entities (such as via tagging, entries in a table in a relational database, etc.).
For simplicity of explanation, the method gets depicted and described as a series of acts.
The client machines may run an operating system (OS) that manages the hardware and software of the client machines.
Datastore may be hosted by storage devices, such as main memory, magnetic or optical storage-based disks, tapes or hard drives, etc.
.
The default relevancy score maybe 1.0 on a scale from 0.0 to 1.0, the default relevancy score may be based on the particular user (such as a score of 1.0 when the user is known from past history to be very good at suggesting attributes, a score of 0.8 when the user is known to be somewhat good at suggesting attributes, etc.).
Subjective attribute identifier may identify subjective attributes for entities based on other types of user reactions to entities, such as:
‘Like!’ or ‘Dislike!’
Sharing the entity
Bookmarking the entity
Adding the entity to a playlist
So forth
The term article of manufacture, as used herein, gets intended to encompass a computer program accessible from any computer-readable device or storage media.
When the set of positive examples and the set of negative criteria are unequal in size, the more extensive set may get sampled to match the size of the smaller group.
The client machines may be wireless terminals (smartphones, etc.), personal computers (PC), laptops, tablet computers, or any other computing or communication devices.
Additionally, it should get appreciated that the methods disclosed in this specification are capable of getting stored on an article of manufacture to ease transporting and transferring such methodologies to computing devices.
It should get noted that when an entity is an actual physical entity (such as a person, a restaurant, etc.), the pre-processing of the entity gets performed via a “cyber proxy” associated with the physical entity (such as a fan page for an actor on a social networking website, a restaurant review on a website, etc.); but, the subjective attributes get considered to get associated with the entity itself (such as the actor or restaurant, not the actor’s fan page or the restaurant review).
The entity is then associated with the identified subjective attributes and relevancy scores (such as via tags applied to the entity, via entries in a table of a relational database, etc.).
An example of a method for performing get described in detail.
An implementation of a first example method is described in detail below, and the performance of a second example method is described.
The subjective attribute manager:
Identifies subjective attributes for entities based on user reaction (such as comments, Like!, sharing, bookmarking, playlisting, etc.)
Determines relevancy scores for subjective attributes about entities
Associates subjective attributes and relevancy scores with entities
Extracts features like image features such as color, texture, and intensity; audio features like amplitude, spectral coefficient ratios; textual features like word frequencies, average sentence length, formatting parameters; metadata associated with the entity; etc.) from entities to generate feature vectors
Trains a classifier based on the feature vectors and the subjective attributes’ relevancy scores
Uses the trained classifier to predict subjective attributes for new entities based on feature vectors of the new entities
A Subjective Attribute Manager
A relevancy score is determined for a particular subjective attribute based on the frequency with which the personal attribute appears in user comments, the specific users that provided the subjective details in their words (such as some users may be known from experience to be more accurate in their comments than other users, etc.).
A computer readable storage medium has instructions that cause a computer system to perform operations including:
Identifying a first set of subjective attributes for a first entity based on a reaction to the first entity
Obtaining a first feature vector for the first entity
Training a classifier on a set of input-output mappings, wherein the set of input-output mappings comprises an input-output mapping whose input gets based on the first feature vector and whose output gets based on the first set of subjective attributes
Obtaining a second feature vector for a second entity
Providing to the classifier, after the training, the second feature vector to get a second set of subjective attributes for the second entity
The particular user associated with the location (such as a specific user may be an authority on classical music and thus comments about an entity in their newsfeed may get weighted more than comments in another newsfeed, etc.), non-textual user reactions (such as “Like!”, “Dislike!”, “+1”, etc.).
If the subjective attribute “cute” appears in a significant proportion of comments for a video clip, then “cute” may get assigned a high relevancy score.
In some other aspects these computations get performed by another entity, such as an Executable library of:
Image processing routines hosted by server machine [not depicted in the Figures]
Audio processing routines
Text processing routines
Etc
A feature vector for entity E gets generated.
We often see user gnnerated content in the form of comments, such as:
A Comment by a first user about content shared by a second user within a social network
User comments in response to an article in a columnist’s blog
A comment from a video clip posted on a content hosting website
Reviews (such as of products, movies)
Actions (such as Like!, Dislike!, +1, sharing, bookmarking, playlisting, etc.)
So forth
A set S of entities (such as all the entities in the entity store, a subset of entities in the entity store, etc.) is pre-processed.
The server may serve web pages and content from the entity stores to clients.
The entity store is persistent storage that is capable of storing entities such as media clips (such as video clips, audio clips, clips containing both video and audio, images, etc.) and other types of content items (such as webpages, text-based documents, restaurant reviews, movie reviews, etc.), as well as data structures to tag, organize, and index the entities.
The server machine includes a web server and a subjective attribute manager. The web server and emotional attribute manager may run on different devices.
By some aspects, the subjective attributes identified may be discarded based on their relevancy scores (such as retaining the k emotional attributes with the highest relevancy scores, discarding any personal attribute whose relevancy score is below a threshold, etc.).
For example, suppose the personal attribute identifier identifies seven emotional attributes that appear in user comments at least three times. In that case, the relevancy scorer may, for example, retain only the five subjective attributes with the highest relevancy scores and discard the other two emotional attributes (such as by setting their relevancy scores to zero, etc.).
This patent can improve the quality of entity descriptions, such as tags for a video clip, improving the quality of searches and the targeting of advertisements.
A System Architecture To Identify Subjective Attributes
A relevancy score is a natural number between 0.0 and 1.0 inclusive.
Under one aspect, pre-processing of the entities comprises identifying user reactions to the entities and then training a classifier based on the responses.
When An Entity Is An Actual Physical Entity
The method is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as gets run on a general-purpose computer system or a dedicated machine), or both.
The vocabulary gets seeded with a small number of subjective attributes expected to apply to entities. The vocabulary gets expanded over time as more terms or phrases that appear in user reactions get identified via automated processing of the responses.
The above procedure is performed for each entity in a given set of entities (such as video clips in a video clip repository, etc.), and an inverse mapping from subjective attributes to entities in the group is generated based on personal qualities and relevancy scores.
The relevancy scorer may keep only the k most relevant subjective attributes and discard other personal attributes.
The browser may be a web browser that can access web pages and content served by a web server.
The set of positive examples for a subjective attribute may include feature vectors for entities associated with that particular personal attribute.
The different locations may get weighted in their contribution to relevancy scores based on a variety of factors, such as a:
The personal attribute identifier identifies subjective attributes for entities based on user reaction to the entities.
Execution continues back.
But acts can occur in various orders and and with other acts not presented and described herein.
I haven’t seen a patent about subjective attributes of entities or responses to those entities.
After training, the classifier may predict subjective attributes for other entities not in the training set by providing feature vectors for these entities as input to the classifier.
For example, if there are 40 comments and “cute” appears in 20 words and “awesome” appears in 8 comments, then “cute” may get assigned a relevancy score that is higher than “awesome.”
Under one aspect, the relevancy scores get normalized to fall in intervals [0, 1].
It starts with:
Identifying a first set of subjective attributes for a first entity based on a reaction to the first entity (such as comments on a website, a demonstration of approval of the first entity (such as “Like!, etc.)
Sharing the first entity
Bookmarking the first entity
Adding the first entity to a playlist
Training a classifier (such as a support vector machine, AdaBoost, a neural network, a decision tree on a set of input-output mappings, where the set of input-output mappings comprises an input-output mapping whose input is Providing a feature vector for the first entity, whose output gets based on the first set of subjective attributes
Providing a feature vector for a second entity to the trained classifier to get a second set of subjective attributes for the second entity
A feature vector for entity E gets generated. The trained classifier provides the feature vector to get predicted subjective attributes and relevancy scores for entity E.
It may contain numerical values about:
Color
Texture
Intensity
Metadata tags associated with the video clip
Etc
The computations associated with extracting features of an entity get performed by the feature extractor itself.
This patent can classify entities according to subjective attributes such as “funny,” “cute,” etc. based on user reaction to the entities.
This may get performed by the feature extractor.
Sharing is caring!
Identifying UGC Subjective Attributes Of Entities
The personal attribute identifier may apply thresholds to determine which attributes are associated with an entity (such as a subjective attribute should appear in at least N comments, etc.).
The method gets performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software, or a combination of both.
The data store may be the same as the entity store or a different data store (such as a temporary buffer or a permanent data store) to hold a personal attribute vocabulary, entities that are to get processed, feature vectors associated with entities, personal attributes and relevancy scores related to entities, or some combination of these data.
The classifier is a learning machine (such as support vector machines [SVMs], AdaBoost, neural networks, decision trees, etc.) that accepts as input a feature vector associated with an entity and outputs relevancy scores (such as an actual number between 0 and 1 inclusive, etc.) for each subjective attribute of the personal attribute vocabulary.
The feature vector and the relevancy scores obtained get added to the training set.
For example, if there are 40 comments and “cute” appears in 20 words and “awesome” appears in 8 comments, then “cute” may get assigned a relevancy score that is higher than “awesome.”
Various components of subjective attribute managers may run on separate machines (such as personal attribute identifier and relevancy scorer may run on one device while feature extractor and classifier run on another device, etc.).
Ways for identifying and predicting subjective attributes for entities (such as media clips, images, newspaper articles, blog entries, persons, organizations, commercial businesses, etc.).
Entity E gets stored as a negative example of the removed attribute(s) for future re-training of the classifier. The refined set of subjective attributes and corresponding relevancy scores are associated with entity E (such as via tags applied to entity E, via entries in a table of a relational database, etc.).
A classifier gets trained using (i) a training input comprising a set of features for the first media item and a target output for the training input, the target output comprising the respective relevancy scores for the subjective attributes of the first media item.
Identifying And Predicting Subjective Attributes For Entities
As described above, the feature vector for a video clip or still image may contain numerical values about color, texture, intensity, etc.. In contrast, the feature vector for an audio clip (or a video clip with sound) may include numerical values about amplitude, spectral coefficients, etc.. In contrast, the feature vector for a text document may include numerical values about word frequencies, average sentence length, formatting parameters, and so forth.
The components can get combined or separated into further details.
The set of negative examples for a subjective attribute may include feature vectors for entities that have not gotten associated with that particular personal attribute.
Relevancy scores for the subjective attributes get determined by entity E.
This recently granted patent is about identifying subjective attributes of entities.
A set of positive examples and negative criteria are assembled for each subjective attribute in the personal attribute vocabulary.
Furthermore, not all illustrated acts may get required to install the methods by the disclosed subject matter.
In addition, those skilled in the art will understand and appreciate that the method could be represented as a series of interrelated states via a state diagram or events.
A critical aspect of it is that it is user-generated content.
The subjective attribute manager notifies users of the types of information stored in the data store and entity store and allows users to choose not to have such information collected and shared with the subjective attribute manager.
The Subjective Attribute Identifier
The system architecture includes a:
Server machine
Entity store
Client machines are connected to a network
Under this patent, a way to identify and predict subjective attributes for entities (such as media clips, images, newspaper articles, blog entries, persons, organizations, commercial businesses, etc.) gets provided.
The classifier may include multiple classifiers (such as a classifier for each subjective attribute in the personal attribute vocabulary, etc.).
The client machines may also upload:
Web pages
Media clips
Blog entries
links to articles
So forth
A default relevancy score gets assigned to any new subjective attributes that got added by the user.
The block gets performed by subjective attribute identifier.
Subjective attributes for entity E are identified based on user reactions to entity E (such as user comments, Like!, bookmarking, sharing, adding to a playlist, etc.).
The trained classifier provides the feature vector to get predicted subjective attributes and relevancy scores for entity E.
The classifier gets trained on all the examples of the training set, such that the feature vector of a training example gets provided as input to the classifier, and the subjective attribute relevancy scores get provided as output.
Obtaining Subjective Attributes And Relevancy Scores For An Entity
The subjective attribute manager may be the same as the subjective attribute manager and may include a:
Subjective attribute identifier
Relevancy scorer
Feature extractor
Classifier
Data store
The bock checks whether the set S of entities is empty; if S is non-empty, execution continues, otherwise execution proceeds.
User reactions may get aggregated for entities that occur in many locations, such as:
Entities that appear in many users’ playlists
Entities that have gotten shared and appear in a plurality of users’ “newsfeeds” on a social networking website
Etc
Subjective attributes (such as “cute,” “funny,” “awesome,” etc.) get defined, and subjective attributes for a particular entity get identified based on user reaction to the entity, such as:
Comments on a website
Like!
Sharing the first entity with other users
Boomarking the first entity
Adding the first entity to a playlist
Etc
Relevancy Scores For The Subjective Attributes Get Determined About The Entity
The identification of subjective attributes includes performing processing of user comments, such as by:
Matching words in user comments against subjective attributes in the vocabulary
Combining word matching and other natural language processing techniques such as syntactic and semantic analysis
Etc
Entities that Occur Near Locations
Subject attributes and relevancy scores for entity E get obtained.
The predicted subjective attributes obtained get suggested to a user (such as the user who uploaded the entity. A refined set of personal attributes is obtained from the user, such as via a web page in which the user selects from among the suggested attributes and possibly adds new attributes, etc.).
A Default Relevancy Score For Entities
We get told that user-generated content (UGC) is becoming more common on the Web because of the increasing popularity of social networks, blogs, review websites, etc.
The method is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as gets run on a general-purpose computer system or a dedicated machine), or both.
The entity store might get hosted by a network-attached file server. In contrast, in other implementations, the entity store may get hosted by some other type of persistent storage such as that of the server machine or different machines coupled to the server machine via the network.
It should be noted that a subjective attribute may be discarded by setting its relevancy score to zero in some aspects.
Subjective Attributes And Relevancy Scores Are Associated With The Entities
The method gets performed by the server machine, while some other implementations may get performed by another device.
A memory and a processor get provided to identify and predict subjective attributes for entities.
The classifier consists of a single classifier.
The Block branches get based on whether the user removed any of the suggested subjective attributes (such as by not selecting the attribute).
The trained classifier can then predict subjective attributes for entities not in the training set (such as a newly-uploaded video clip, a news article that has not yet received any comments, etc.).
The inverse mapping can then get used to identify all entities in the set that match a given subjective attribute (such as all entities that have gotten associated with the subjective attribute “funny”, etc.), thereby enabling:
Rapid retrieval of relevant entities for processing keyword searches
Populating playlists
Delivering advertisements
Generating training sets for the classifier
So forth
A set of subjective attributes may get obtained from the classifier’s output by including all emotional attributes with non-zero relevancy scores. A group of subjective points may be obtained by applying the most minor threshold to the numerical scores (by considering all personal attributes that have a score of at least, say, 0.2 as being a member of the set).
Identifying Subjective Attributes Of Entities
It should get noted that the classifier may be re-trained (such as after every 100 iterations of the loop, every N days, etc.) by a re-training process that may execute concurrently.
Pre-Processing A Set Of Entities
A vocabulary of subjective attributes gets generated.
The method gets performed by the server machine, while some others may get performed by another device.
In one aspect, the feature vector for a video clip or still image may contain numerical values about color, texture, intensity, etc., while the feature vector for an audio clip (or a video clip with sound) may include numerical values about amplitude, spectral coefficients, etc., while the feature vector for a text document may include:
Numerical values about word frequencies
Average sentence length
Formatting parameters
So forth
The predicted subjective attributes and relevancy scores get associated with entity E (such as via tags applied to entity E, via entries in a table of a relational database, etc.).
A Second Method For Obtaining Subjective Attributes And Relevancy Scores For An Entity
A system and method for identifying and predicting subjective attributes for entities (such as media clips, movies, television shows, images, newspaper articles, blog entries, persons, organizations, commercial businesses, etc.) get disclosed.
In some aspects, the subjective attribute vocabulary may get defined. In contrast, in some other factors, the personal attribute vocabulary may get generated in an automated fashion by collecting terms and phrases that get used in users’ reactions to entities. In contrast, in yet other aspects, the vocabulary may get generated by a combination of manual and automated techniques.
In addition, the number of locations where the entity appears may also be used in determining subjective attributes and relevancy scores (such as relevancy scores for a video clip may be increased when the video clip is in hundreds of user playlists, etc.).
A feature vector for entity E gets obtained.
The subjective attributes and relevancy scores obtained are associated with entity E (such as by applying corresponding tags to the entity, adding a record in a relational database table, etc.).
Atn entity E that is not in set S is received (such as a newly-uploaded video clip, a news article that has not yet received any comments, an entity in entity store that was not included in the training set, etc.).
This patent on dentifying subjective attributes for entities =is found at:
The relevancy scorer determines relevancy scores for subjective attributes about entities.